Guide · 2026-04-03 · 11 min

The 7 Best Sample Databases for Learning SQL in 2026 (and Their Limitations)

An honest comparison of AdventureWorks, Northwind, Chinook, Sakila, and newer alternatives. What each is good for and where it falls short.

The Classic Problem

Every SQL tutorial starts with a sample database. But most sample databases were designed decades ago for a different era of software development. They're small, they lack realistic business complexity, and they don't reflect modern data engineering practices.

Here's an honest comparison of the most popular options — and where the field is heading.


1. AdventureWorks (Microsoft)

Released: 2005 (updated through 2019)

Tables: ~70 (many unused)

Rows: ~120K across all tables

Format: SQL Server backup (.bak)

Strengths

  • Well-documented with extensive tutorials
  • Covers manufacturing, sales, and HR domains
  • Free download from Microsoft (a separate download, not installed with SQL Server itself)

Limitations

  • SQL Server only — not portable to PostgreSQL, MySQL, or SQLite without conversion
  • Outdated schema — designed for SQL Server 2005 patterns
  • No accounting — no double-entry bookkeeping, no journal entries, no trial balance
  • No tax compliance — no GST, VAT, PAYG, or payroll tax calculations
  • Complex but shallow — many tables exist but with minimal data
  • Not deterministic — no way to regenerate the same data
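
To make the determinism point concrete, here is a minimal sketch of seed-based data generation in Python. The `generate_orders` function and its fields are hypothetical and are not taken from AdventureWorks or any other dataset in this comparison. The only point is that a generator driven by a fixed seed reproduces the same rows on every run, which a static backup file cannot offer when you need a fresh but identical copy.

```python
import random


def generate_orders(seed, n=5):
    """Generate n fake order rows. Hypothetical schema, for illustration only."""
    rng = random.Random(seed)  # private RNG, so global random state cannot interfere
    rows = []
    for order_id in range(1, n + 1):
        rows.append({
            "order_id": order_id,
            "customer_id": rng.randint(1, 100),
            "amount_cents": rng.randint(500, 50_000),
        })
    return rows


first = generate_orders(seed=42)
second = generate_orders(seed=42)

# Same seed, same rows: this is what "deterministic" means for a sample dataset.
print(first == second)
```

Storing amounts as integer cents rather than floats is a deliberate choice here, since it keeps equality comparisons exact.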

"AdventureWorks is the default recommendation, but it's a 20-year-old database designed to showcase SQL Server features, not to teach realistic business data modeling." — Common criticism in data engineering communities


2. Northwind (Microsoft)

Released: 1997

Tables: 13

Rows: ~3K

Format: SQL Server / Access (.mdb)

Strengths

  • Simple and easy to understand
  • Great for absolute beginners
  • Covers orders, products, customers, suppliers

Limitations

  • Extremely small — too little data for meaningful analytics
  • No financial data — no invoices, payments, or accounting
  • Single currency, single country — no multi-jurisdiction complexity
  • No payroll or HR: employee records carry little beyond basic contact info
  • Frozen in 1997 — product catalog feels dated

3. Chinook (open source)

Released: 2008

Tables: 11

Rows: ~15K

Format: Multiple (SQLite, PostgreSQL, MySQL, SQL Server)

Strengths

  • Multi-platform — available for all major databases
  • Media/music domain (tracks, albums, artists, playlists)
  • Good for teaching joins and relationships

Limitations

  • Niche domain — media store, not a general business
  • Small — 11 tables is not enough for complex queries
  • No financial depth — invoices exist but no accounting behind them
  • No temporal complexity — no time-series patterns to analyze
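
As a sketch of the join practice Chinook is suited to, the following uses Python's built-in `sqlite3` module with an in-memory database. The three tables are a simplified stand-in loosely modeled on Chinook's artist, album, and track structure. The exact column names and all rows are assumptions made up for this example, not the real Chinook data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Artist (ArtistId INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Album (
    AlbumId  INTEGER PRIMARY KEY,
    Title    TEXT NOT NULL,
    ArtistId INTEGER NOT NULL REFERENCES Artist(ArtistId)
);
CREATE TABLE Track (
    TrackId INTEGER PRIMARY KEY,
    Name    TEXT NOT NULL,
    AlbumId INTEGER NOT NULL REFERENCES Album(AlbumId)
);
-- Invented rows, for illustration only.
INSERT INTO Artist VALUES (1, 'Artist A'), (2, 'Artist B');
INSERT INTO Album VALUES (1, 'First Album', 1), (2, 'Second Album', 1), (3, 'Debut', 2);
INSERT INTO Track VALUES (1, 'Intro', 1), (2, 'Song One', 1), (3, 'Song Two', 2), (4, 'Opening', 3);
""")

# Count tracks per artist by joining through the album table.
tracks_per_artist = conn.execute("""
    SELECT ar.Name, COUNT(t.TrackId) AS track_count
    FROM Artist AS ar
    JOIN Album  AS al ON al.ArtistId = ar.ArtistId
    JOIN Track  AS t  ON t.AlbumId   = al.AlbumId
    GROUP BY ar.ArtistId, ar.Name
    ORDER BY track_count DESC, ar.Name
""").fetchall()

print(tracks_per_artist)  # [('Artist A', 3), ('Artist B', 1)]
```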

4. Sakila (MySQL)

Released: 2005

Tables: 16

Rows: ~47K

Format: MySQL

Strengths

  • DVD rental domain — intuitive and fun
  • Good for practicing joins, subqueries, and aggregations
  • Well-structured with clear relationships

Limitations

  • MySQL only — designed for MySQL-specific features
  • Obsolete domain — DVD rentals in 2026?
  • No business operations — no purchasing, no payroll, no inventory management
  • Limited scale — not suitable for data engineering or BI workloads

5. TPC-H / TPC-DS (Transaction Processing Performance Council)

Released: 1999 / 2006

Tables: 8 (TPC-H) / 25 (TPC-DS)

Rows: Scalable (1GB to 100TB)

Format: Generator tool

Strengths

  • Industry standard for benchmarking
  • Scalable — generate any size dataset
  • Well-defined queries — comes with standard benchmark queries
  • Complex star/snowflake schemas (TPC-DS)

Limitations

  • Designed for benchmarking, not learning — schema is abstract and unintuitive
  • No business realism — table names like LINEITEM, PARTSUPP don't map to real business concepts
  • No accounting or compliance — purely transactional
  • Difficult to set up — requires compilation and configuration

6. PostgreSQL Sample Databases (dvdrental, pagila)

Released: Various

Tables: 15-16

Rows: ~46K

Format: PostgreSQL

Strengths

  • Native PostgreSQL format
  • Active community maintenance
  • Good documentation

Limitations

  • Same domain limitations as Sakila (DVD rentals)
  • Small scale
  • No financial or operational depth

7. Mindweave SME-Sim Datasets (2026)

Released: 2026

Tables: 42

Rows: 39K - 259K per company (up to 825K in bundles)

Format: CSV, PostgreSQL SQL, Apache Parquet, SQLite

Strengths

  • 42 tables, 44 foreign keys — full end-to-end business operations
  • Double-entry accounting — debits always equal credits
  • Real tax compliance: uses actual ATO (AU), IRS (US), and HMRC (UK) tax brackets
  • 3 countries × 3 industries — genuinely different business patterns
  • 4 formats — use with any database or analytics tool
  • Deterministic — same seed = identical data every time
  • Multi-company bundles — test group reporting and consolidation
  • Time-series rich — 730+ days of day-by-day simulation
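
The "debits always equal credits" property is something you can verify with a single query. The sketch below assumes a hypothetical `journal_lines` table with one row per posting. It is not the actual SME-Sim schema, and the rows are invented, with one entry deliberately left unbalanced so the check has something to report.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE journal_lines (
    entry_id     INTEGER NOT NULL,
    account      TEXT    NOT NULL,
    debit_cents  INTEGER NOT NULL DEFAULT 0,
    credit_cents INTEGER NOT NULL DEFAULT 0
);
-- Entry 1: cash sale of 110.00 including 10.00 tax (balanced).
INSERT INTO journal_lines VALUES (1, 'Cash',          11000, 0);
INSERT INTO journal_lines VALUES (1, 'Sales Revenue', 0,     10000);
INSERT INTO journal_lines VALUES (1, 'Tax Payable',   0,     1000);
-- Entry 2: rent payment, deliberately unbalanced by 10.00.
INSERT INTO journal_lines VALUES (2, 'Rent Expense',  50000, 0);
INSERT INTO journal_lines VALUES (2, 'Cash',          0,     49000);
""")

# A double-entry ledger is consistent when every entry's debits equal its credits.
unbalanced = conn.execute("""
    SELECT entry_id, SUM(debit_cents) - SUM(credit_cents) AS difference_cents
    FROM journal_lines
    GROUP BY entry_id
    HAVING SUM(debit_cents) <> SUM(credit_cents)
""").fetchall()

print(unbalanced)  # [(2, 1000)]
```

On a dataset that truly enforces double-entry bookkeeping, the equivalent query should return no rows.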

Limitations

  • Not free (full datasets) — $19-$199 depending on product, but free samples available
  • SME focused — small/medium business, not enterprise or manufacturing
  • Three countries — AU, US, UK only (more planned)

Comparison Table

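| Database | Tables | Rows | Accounting | Tax | Multi-format | Deterministic | Free |
|---|---|---|---|---|---|---|---|
| AdventureWorks | ~70 | ~120K | No | No | SQL Server | No | Yes |
| Northwind | 13 | ~3K | No | No | SQL Server | No | Yes |
| Chinook | 11 | ~15K | No | No | Yes | No | Yes |
| Sakila | 16 | ~47K | No | No | MySQL | No | Yes |
| TPC-H | 8 | Scalable | No | No | Generator | Yes | Yes |
| TPC-DS | 25 | Scalable | No | No | Generator | Yes | Yes |
| SME-Sim | 42 | 39K-825K | Yes | Yes | Yes (4) | Yes | Samples |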

Which Should You Use?

Learning basic SQL joins: Chinook or Sakila — small, simple, well-documented.

Learning business data modeling: SME-Sim — 42 tables with real business relationships and accounting.

Benchmarking database performance: TPC-H or TPC-DS — industry standard, scalable.

Testing ERP or accounting software: SME-Sim — the only option with double-entry accounting and real tax compliance.

Building BI dashboards: SME-Sim — 730+ days of time-series data across 7 business domains.

Data engineering pipelines: SME-Sim (Parquet format) or TPC-DS — both offer structured, scalable data.
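
For the BI dashboard case, the value of day-by-day data is that it can be rolled up to any reporting period. The sketch below assumes a hypothetical `daily_sales` table and fills it with synthetic values following a simple weekly pattern. Neither the table name nor the numbers come from any of the datasets above.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_sales (sale_date TEXT PRIMARY KEY, revenue_cents INTEGER NOT NULL)"
)

# Synthetic data: 90 consecutive days from 2025-01-01 with a weekday-based pattern.
start = date(2025, 1, 1)
rows = []
for offset in range(90):
    day = start + timedelta(days=offset)
    revenue = 10_000 + 1_000 * day.weekday()  # Monday=0 ... Sunday=6
    rows.append((day.isoformat(), revenue))
conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", rows)

# Roll daily rows up to one row per calendar month.
monthly = conn.execute("""
    SELECT strftime('%Y-%m', sale_date) AS month,
           COUNT(*)                     AS days,
           SUM(revenue_cents)           AS revenue_cents
    FROM daily_sales
    GROUP BY month
    ORDER BY month
""").fetchall()

for month, days, revenue_cents in monthly:
    print(month, days, revenue_cents)
```

Storing dates as ISO 8601 text is what lets SQLite's `strftime` group them directly. Other engines would use their native date type and a function such as `date_trunc` instead.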


Try It

Free samples are available on multiple platforms.


*This comparison is based on publicly available documentation for each database as of April 2026. AdventureWorks and Northwind are trademarks of Microsoft Corporation. TPC-H and TPC-DS are trademarks of the Transaction Processing Performance Council.*
