Inserting Data
Inserting Data
The INSERT command is used to add new rows of data into a Cassandra table. Unlike traditional SQL databases, Cassandra uses an upsert approach - if a row with the same primary key already exists, it will be overwritten.
Syntax
INSERT INTO [keyspace_name.]table_name
(column1, column2, column3, ...)
VALUES (value1, value2, value3, ...)
[USING option];
Basic Insert Examples
Simple Insert
INSERT INTO student (id, name, city)
VALUES (1, 'Paul', 'Texas');
INSERT INTO student (id, name, city)
VALUES (2, 'John', 'New York');
INSERT INTO student (id, name, city)
VALUES (3, 'Max', 'New Jersey');
Insert with All Columns
INSERT INTO employee (emp_id, first_name, last_name, department, salary, hire_date)
VALUES (101, 'Alice', 'Johnson', 'Engineering', 75000, '2023-01-15');
Insert Options
Using TTL (Time To Live)
Set data to automatically expire after a specified time in seconds:
INSERT INTO session_data (session_id, user_id, data)
VALUES ('abc123', 1001, 'session info')
USING TTL 3600; -- Data expires after 1 hour
Using Timestamp
Specify a custom timestamp for the insert:
INSERT INTO events (event_id, event_type, description)
VALUES (uuid(), 'login', 'User logged in')
USING TIMESTAMP 1234567890;
Combining TTL and Timestamp
INSERT INTO cache_data (key, value)
VALUES ('user:1001', 'cached data')
USING TTL 300 AND TIMESTAMP 1234567890;
Working with Different Data Types
UUID Values
INSERT INTO users (user_id, username, created_at)
VALUES (uuid(), 'newuser', toTimestamp(now()));
Collections
-- List
INSERT INTO user_interests (user_id, interests)
VALUES (1, ['reading', 'gaming', 'coding']);
-- Set
INSERT INTO user_tags (user_id, tags)
VALUES (1, {'developer', 'gamer', 'reader'});
-- Map
INSERT INTO user_settings (user_id, preferences)
VALUES (1, {'theme': 'dark', 'language': 'en', 'notifications': 'on'});
JSON Insert
Cassandra supports inserting data as JSON:
INSERT INTO users JSON '{"user_id": 1, "username": "john_doe", "email": "[email protected]"}';
Conditional Inserts
IF NOT EXISTS
Insert only if the row doesn't already exist:
INSERT INTO users (user_id, username, email)
VALUES (1, 'john_doe', '[email protected]')
IF NOT EXISTS;
This returns a boolean indicating success or failure.
Batch Inserts
For inserting multiple rows efficiently:
BEGIN BATCH
INSERT INTO student (id, name, city) VALUES (4, 'Emma', 'Chicago');
INSERT INTO student (id, name, city) VALUES (5, 'Oliver', 'Seattle');
INSERT INTO student (id, name, city) VALUES (6, 'Sophia', 'Boston');
APPLY BATCH;
Important Notes
- Primary Key Constraint: You cannot insert a row without providing all primary key columns
- Upsert Behavior: INSERT will overwrite existing rows with the same primary key
- No Foreign Key Checks: Cassandra doesn't enforce referential integrity
- Timestamp: Every insert has an associated timestamp used for conflict resolution
- Performance: Cassandra is optimized for writes, making inserts very fast
Common Errors and Solutions
Missing Primary Key
-- Error: Primary key column 'id' is required
INSERT INTO student (name, city) VALUES ('Jane', 'Miami');
-- Correct:
INSERT INTO student (id, name, city) VALUES (7, 'Jane', 'Miami');
Invalid Data Type
-- Error: Invalid data type for column
INSERT INTO student (id, name, city) VALUES ('abc', 'Jane', 'Miami');
-- Correct:
INSERT INTO student (id, name, city) VALUES (8, 'Jane', 'Miami');