Web Scraping: Python vs Rust

Introduction

Web scraping is about as error-prone as it gets. Pages might not exist, and expected HTML elements might be missing. A language that handles errors and edge cases gracefully at runtime, without crashing, is therefore a huge plus.
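To make those two failure modes concrete, here is a minimal Python sketch using only the standard library. It is an illustration, not the code behind this post: the helper names `fetch` and `extract_price` are hypothetical, and the `price_color` class is assumed to be the one the target site uses for prices. Both helpers return None instead of raising, so one bad page or one missing element does not stop the whole run.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.error import URLError
from urllib.request import urlopen


class _PriceParser(HTMLParser):
    """Collects the text of the first <p class="price_color"> element.

    The class name is an assumption about the page markup.
    """

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self.price: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples.
        if tag == "p" and self.price is None and ("class", "price_color") in attrs:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._inside = False

    def handle_data(self, data):
        text = data.strip()
        if self._inside and text:
            self.price = text
            self._inside = False


def fetch(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the page body, or None if the page cannot be downloaded."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (URLError, TimeoutError) as exc:
        # URLError also covers HTTPError (for example a 404 response).
        print(f"skipping {url}: {exc}")
        return None


def extract_price(html: str) -> Optional[str]:
    """Return the first price found, or None if the element is missing."""
    parser = _PriceParser()
    parser.feed(html)
    parser.close()
    return parser.price
```

A caller can then check for None and skip the page or field, rather than letting an exception end the scrape.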

Performance

Performance test: scraping the 50 pages of the catalogue, starting from http://books.toscrape.com/catalogue/page-1.html

Name                CPU Usage  Time (s)
Synchronous Python  5%         44.3
Synchronous Rust    7%         55
Async Python        63%        2.5
Async Rust          107%       2.25

Performance is fairly similar at such a low volume of requests, because most of the time is spent downloading. With significantly more requests, a bigger difference might show up.
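The gap between the synchronous and async rows comes from overlapping the waiting time of many downloads. The Python sketch below illustrates that effect under a loud assumption: `fake_download` is a hypothetical stand-in that only sleeps, so no real HTTP request is made and the timings it produces are not the benchmark above.

```python
import asyncio
import time


async def fake_download(page: int, delay: float) -> str:
    # Hypothetical stand-in for an HTTP request: it only waits, like network I/O.
    await asyncio.sleep(delay)
    return f"page-{page}"


async def scrape_sequential(pages: int, delay: float) -> list[str]:
    # Each download starts only after the previous one has finished.
    return [await fake_download(p, delay) for p in range(1, pages + 1)]


async def scrape_concurrent(pages: int, delay: float) -> list[str]:
    # All downloads are started together, so their waits overlap.
    tasks = (fake_download(p, delay) for p in range(1, pages + 1))
    return list(await asyncio.gather(*tasks))


def timed(coro_factory) -> tuple[list[str], float]:
    # Run a coroutine to completion and measure the wall-clock time it took.
    start = time.perf_counter()
    result = asyncio.run(coro_factory())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, t_seq = timed(lambda: scrape_sequential(50, 0.05))
    _, t_con = timed(lambda: scrape_concurrent(50, 0.05))
    print(f"sequential: {t_seq:.2f}s, concurrent: {t_con:.2f}s")
```

With a real HTTP client the shape is the same: total time for the sequential version grows with the number of pages, while the concurrent version is bounded mostly by the slowest responses.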


This blog was originally published on: https://able.bio/haixuanTao/web-scraper-python-vs-rust--d6176429