Introduction. This article is also available on YouTube. Replication in distributed systems means that each piece of data has more than one copy and each copy is located on a separate node. There are a few reasons to adopt replication: to achieve data redundancy and therefore allow a system...
Databases and Storages
Saving Spark DataFrame to Apache Cassandra using DataStax Spark Connector
Hi! Today we’re going to look at the DataStax Spark Cassandra Connector. Topics covered in this video: generating a test CSV dataset using Python; creating a schema in Cassandra; preparing a Jupyter workbench; reading the CSV into a DataFrame; writing the DataFrame to Cassandra.
Apache Cassandra Write Path, Compaction and Use Cases in 3 Minutes
Over the past few years, Apache Cassandra has become one of the most popular NoSQL solutions for big data. It started back in 2008 as an open-source project from Facebook, became an Apache Incubator project in 2009, and graduated to a top-level project in 2010. This is the three-minute tech series, I’m Alex Sergeenko and...