DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Related

  • What Developers Need to Know About Table Geo-Partitioning
  • How To Set Up a Scalable and Highly-Available GraphQL API in Minutes
  • Getting Started With Apache Cassandra
  • How to Hive on GCP Using Google DataProc and Cloud Storage: Part 1

Trending

  • Java's Quiet Revolution: Thriving in the Serverless Kubernetes Era
  • How to Build Local LLM RAG Apps With Ollama, DeepSeek-R1, and SingleStore
  • Build Your First AI Model in Python: A Beginner's Guide (1 of 3)
  • Grafana Loki Fundamentals and Architecture
  1. DZone
  2. Data Engineering
  3. Data
  4. How To Use Geo-Partitioning to Comply With Data Regulations and Deliver Low Latency Globally

How To Use Geo-Partitioning to Comply With Data Regulations and Deliver Low Latency Globally

Read a geo-partitioning guide to deploying a YugabyteDB Managed database cluster and optimizing data distribution across regions.

By 
Denis Magda user avatar
Denis Magda
DZone Core CORE ·
Jun. 02, 23 · Tutorial
Likes (2)
Comment
Save
Tweet
Share
6.0K Views

Join the DZone community and get the full member experience.

Join For Free

In today's interconnected world, application users can span multiple countries and continents. Maintaining low latency across distant geographies while dealing with data regulatory requirements can be a challenge. The geo-partitioning feature of distributed SQL databases can help solve that challenge by pinning user data to the required locations.

So, let’s explore how you can deploy a geo-partitioned database cluster that complies with data regulations and delivers low latency across multiple regions using YugabyteDB Managed.

Deploying a Geo-Partitioned Cluster Using YugabyteDB Managed

YugabyteDB is an open-source distributed SQL database built on PostgreSQL. You can deploy a geo-partitioned cluster within minutes using YugabyteDB Managed, the DBaaS version of YugabyteDB.

Yugabyte DB: Create Cluster/Select regions

Getting started with a geo-partitioned YugabyteDB Managed cluster is easy. Simply follow the below: 

  1. Select the Multi-region Deployment option. When creating a dedicated YugabyteDB Managed cluster, choose the “multi-region” option to ensure your data is distributed across multiple regions.
  2. Set the Data Distribution Mode to “partitioned." Select the "partition by region" data distribution option so that you can pin data to specific geographical locations.
  3. Choose target cloud regions. Place database nodes in the cloud regions of your choice. In this blog, we spread data across two regions - South Carolina (us-east1) and Frankfurt (europe-west3).

Once you've set up a geo-partitioned YugabyteDB Managed cluster, you can connect to it and create tables with partitioned data.

Create a Geo-Partitioned Table

To demonstrate how geo-partitioning improves latency and data regulation compliance, let's take a look at an example Account table.

First, create PostgreSQL tablespaces that let you pin data to the YugabyteDB nodes in the USA (usa_tablespace) or in Europe (europe_tablespace):

SQL
 
CREATE TABLESPACE usa_tablespace WITH (
    replica_placement = '{"num_replicas": 3, "placement_blocks":
  [
    {"cloud":"gcp","region":"us-east1","zone":"us-east1-c","min_num_replicas":1},
    {"cloud":"gcp","region":"us-east1","zone":"us-east1-d","min_num_replicas":1},
    {"cloud":"gcp","region":"us-east1","zone":"us-east1-b","min_num_replicas":1}
  ]}'
);

CREATE TABLESPACE europe_tablespace WITH (
    replica_placement = '{"num_replicas": 3, "placement_blocks":
  [
    {"cloud":"gcp","region":"europe-west3","zone":"europe-west3-a","min_num_replicas":1},
    {"cloud":"gcp","region":"europe-west3","zone":"europe-west3-b","min_num_replicas":1},
    {"cloud":"gcp","region":"europe-west3","zone":"europe-west3-c","min_num_replicas":1}
  ]}'
);


  • num_replicas: 3 - Each tablespace requires you to store a copy of data across 3 availability zones within a region. This lets you tolerate zone-level outages in the cloud.

Second, create the Account table and partition it by the country_code column:

SQL
 
CREATE TABLE Account (
    id integer NOT NULL,
    full_name text NOT NULL,
    email text NOT NULL,
    phone text NOT NULL,
    country_code varchar(3)
)
PARTITION BY LIST (country_code);


Third, define partitioned tables for USA and European records:

SQL
 
CREATE TABLE Account_USA PARTITION 
OF Account (id, full_name, email, phone, country_code, 
            PRIMARY KEY (id, country_code))
FOR VALUES IN ('USA') TABLESPACE usa_tablespace;

CREATE TABLE Account_EU PARTITION 
OF Account (id, full_name, email, phone, country_code, 
            PRIMARY KEY (id, country_code))
FOR VALUES IN ('EU') TABLESPACE europe_tablespace;


  • FOR VALUES IN ('USA') - If the country_code is equal to the ‘USA’, then the record is automatically placed or queried from the Account_USA partition that is stored in the usa_tablespace (the region in South Carolina).
  • FOR VALUES IN ('EU') - Otherwise, if the record belongs to the European Union (country_code is equal to 'EU'), then it’s stored in the Account_EU partition from the europe_tablespace (the region in Frankfurt).

Now, let’s examine the read-and-write latency when a user connects from the United States.

Latency When Connecting From the United States

Let’s open a client connection from Iowa (us-central1) to a database node located in South Carolina (us-east1) and insert a new Account record:

SQL
 
INSERT INTO Account (id, full_name, email, phone, country_code) 
    VALUES (1, 'John Smith', '[email protected]', '650-346-1234', 'USA');


As long as the country_code is 'USA', the record will be stored on the database node from South Carolina. The write and read latency will be approximately 30 milliseconds because the client requests need to travel between Iowa and South Carolina.

Connection to US nodes


Next, let’s see what happens when we add and query an account with the country_code set to 'EU':

SQL
 
INSERT INTO Account (id, full_name, email, phone, country_code) 
    VALUES (2, 'Emma Schmidt', '[email protected]', '49-346-23-1234', 'EU'); 

SELECT * FROM Account WHERE id=2 and country_code='EU';


Since this account must be stored in a European data center and must be transferred between the United States and Europe, the latency increases. 

Latency increase

  • The latency for the INSERT (230 ms) is higher than for the SELECT (130 ms) because during the INSERT the record is replicated across three availability zones in Frankfurt.

The higher latency between the client connection in the USA and the database node in Europe signifies that the geo-partitioned cluster makes you compliant with data regulatory requirements. Even if the client from the USA connects to the US-based database node and writes/reads records of residents from the European Union, those records will always be stored/retrieved from database nodes in Europe.

Latency When Connecting From Europe

Let’s see how the latency improves if you open a client connection from Frankfurt (europe-west3) to the database node in the same region, and query the European record recently added from the USA:

Open a client connection from Frankfurt

This time the latency is as low as 3 milliseconds (vs. 130 ms when you queried the same record from the USA) because the record is stored in and retrieved from European data centers.

Adding and querying another European record also maintains low latency, as long as the data is not replicated to the United States.

SQL
 
INSERT INTO Account (id, full_name, email, phone, country_code) 
    VALUES (3, 'Otto Weber', '[email protected]', '49-546-33-0034', 'EU');

SELECT * FROM Account WHERE id=3 and country_code='EU';


Adding and querying another European record-

When accessing data stored in the same region, latency is significantly reduced. The result is a much better user experience while remaining compliant with data regulatory requirements.

Wrap Up

Geo-partitioning is an effective way to comply with data regulations and achieve global low latency. By deploying a geo-partitioned cluster using YugabyteDB Managed, it's possible to intelligently distribute data across regions, while maintaining high-performance querying capabilities.

YugabyteDB Cloud cluster Partition (database) sql Data management

Opinions expressed by DZone contributors are their own.

Related

  • What Developers Need to Know About Table Geo-Partitioning
  • How To Set Up a Scalable and Highly-Available GraphQL API in Minutes
  • Getting Started With Apache Cassandra
  • How to Hive on GCP Using Google DataProc and Cloud Storage: Part 1

Partner Resources

×

Comments

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • [email protected]

Let's be friends:

OSZAR »