
Photo by Joseph Barrientos on Unsplash
Mercator is an open-source web crawler built to crawl the .be, .brussels and .vlaanderen zones every month. It collects DNS records, web technologies, the TLS ciphers in use, SMTP parameters, VAT numbers and much more.
The crawler is built around message queues, specifically SQS. Each module reads its work from a queue and can be scaled individually, which allows a complete crawl of the .be zone within a day. Its deployment target is Elastic Kubernetes Service (EKS) on AWS.
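To make the queue-based design more concrete, here is a minimal sketch in Python. It is not Mercator's actual code: it uses the standard library's in-process `queue.Queue` as a stand-in for SQS, and the names `fake_dns_lookup`, `start_workers` and `crawl` are hypothetical. The point it illustrates is that a module is just a pool of workers reading from a queue, so scaling a module only means changing its worker count.

```python
import queue
import threading


def fake_dns_lookup(domain):
    # Hypothetical placeholder for a real module (DNS, TLS, SMTP, ...).
    return {"domain": domain, "module": "dns", "labels": domain.split(".")}


def start_workers(inbox, outbox, handler, workers):
    """Start `workers` threads that consume from inbox and publish to outbox."""

    def loop():
        while True:
            item = inbox.get()
            if item is None:  # sentinel: no more work for this worker
                break
            outbox.put(handler(item))

    threads = [threading.Thread(target=loop) for _ in range(workers)]
    for thread in threads:
        thread.start()
    return threads


def crawl(domains, workers):
    """Push domains onto a queue and let `workers` workers process them."""
    inbox = queue.Queue()   # stand-in for the module's input queue
    outbox = queue.Queue()  # stand-in for wherever results are published
    threads = start_workers(inbox, outbox, fake_dns_lookup, workers)

    for domain in domains:
        inbox.put(domain)
    for _ in threads:
        inbox.put(None)  # one sentinel per worker
    for thread in threads:
        thread.join()

    results = []
    while not outbox.empty():
        results.append(outbox.get())
    # Workers finish in arbitrary order, so sort for a stable result.
    return sorted(results, key=lambda r: r["domain"])


if __name__ == "__main__":
    for result in crawl(["dnsbelgium.be", "example.brussels"], workers=2):
        print(result)
```

Calling `crawl` with `workers=1` or `workers=8` returns the same results; only the throughput changes. In this sketch the producer and the workers share nothing but the queue, which is the property that lets each module be scaled on its own.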
During my time at DNS Belgium, I helped finish this project and took on many of its maintenance and monitoring tasks. I also worked on closed-source additions to the codebase.