
High-Performance Data Science with Modern C++

The Executable Book

Ontario Tech University



This is the home of an executable book project about using Modern C++ for high-performance data science.

It’s a companion to a series of talks by Armin Sobhani for the Compute Ontario Colloquia.

It’ll be updated as more talks in the series are delivered.

C++ vs. Python for Data Science

😌 Ease of Use

  • C++ has a steeper learning curve and more complex syntax than Python 👎
  • Python’s syntax is simple and readable, making it accessible for beginners 👍

Winner: Python 🥇

📚 Community and Libraries

  • C++'s ecosystem is not as extensive as Python’s for data science 👎
  • Python has extensive libraries such as NumPy, Pandas, and Matplotlib, along with a large and active community 👍

Winner: Python 🥇

🏃 Performance

  • C++ is known for its high performance and efficiency 👍
  • Python is generally slower than C++ due to its interpreted nature 👎

Winner: C++ 🥇
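
To make the performance point a little more concrete, here is a minimal sketch of the kind of tight numeric loop that C++ compiles to native code, with no per-element interpreter overhead. The helper name `mean_and_variance` is hypothetical and used only for illustration; it is not taken from the talks.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical helper: returns {mean, population variance} of a non-empty sample.
std::pair<double, double> mean_and_variance(const std::vector<double>& xs)
{
    assert(!xs.empty());  // precondition: at least one observation

    const double n = static_cast<double>(xs.size());

    // First pass: arithmetic mean over contiguous storage.
    const double mean = std::accumulate(xs.begin(), xs.end(), 0.0) / n;

    // Second pass: sum of squared deviations from the mean.
    double sum_sq = 0.0;
    for (double x : xs) {
        const double d = x - mean;
        sum_sq += d * d;
    }
    return {mean, sum_sq / n};
}
```

For example, calling `mean_and_variance({1.0, 2.0, 3.0, 4.0})` yields a mean of 2.5 and a population variance of 1.25.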

🔀 Concurrency

Winner: C++ 🥇

💼 Memory Management

  • C++ offers fine-grained control over memory management, which can be crucial for large-scale data processing 👍
  • Python manages memory automatically, which leaves less control than C++ 👎

Winner: C++ 🥇
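
As a rough illustration of what fine-grained control can look like in practice, the sketch below allocates a result buffer exactly once and ties a scratch buffer's lifetime to its owner. The helper names `make_squares` and `make_scratch` are hypothetical, chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper: builds the first n squares with a single allocation.
std::vector<double> make_squares(std::size_t n)
{
    std::vector<double> out;
    out.reserve(n);               // one up-front allocation
    assert(out.capacity() >= n);  // reserve guarantees room for n elements

    for (std::size_t i = 0; i < n; ++i) {
        const double x = static_cast<double>(i);
        out.push_back(x * x);     // no reallocation inside the loop
    }
    return out;                   // moved or elided, never deep-copied
}

// Hypothetical helper: a zero-initialized scratch buffer that is released
// automatically when the returned owner goes out of scope.
std::unique_ptr<double[]> make_scratch(std::size_t n)
{
    return std::make_unique<double[]>(n);
}
```

The caller decides when each allocation happens and when the memory is released, with no garbage collector involved.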

💫 Rapid Prototyping

  • C++'s compiled nature makes it a lackluster choice for quick experimentation 👎
  • Python’s interpreted nature combined with Project Jupyter makes it a perfect match for the job 👍

Winner: Python 🥇