Content Data Analyst

Data Analyst, web crawling, web scraping

The Company

Diligent is the world’s largest GRC SaaS provider, serving nearly 1 million users from 25,000 organizations around the world. Our software enables holistic and informed conversations about governance, risk and compliance and ensures CEOs, CFOs and the board have an integrated view of audit, risk, information security, ethics, and compliance from across the organization.

Our world-changing idea is to bring technology, insights, and confidence to leaders so they can build more effective, equitable, and successful organizations – and create lasting, positive impact on the world. We seek to empower organizations to be better for their stakeholders and communities, for their customers and employees, for their bottom line.

Headquartered in New York, Diligent also has offices in Washington D.C., London, Galway, Budapest, Vancouver, Bengaluru, Munich, and Sydney.

Position Overview

We are looking for talented and smart individuals who are passionate about using the latest web scraping technology to extract data from websites and to empower our customers to use it to make significant business decisions. The candidate will support the team’s scraping and production system needs.

Our web scraping processes and product delivery services must be highly available and highly scalable while running multiple data operations processes simultaneously. The ideal candidate will also understand the critical importance of delivering seamless, high-quality customer service.


Key Responsibilities

  • Design and develop stable web crawling and scraping strategies for individual webpages using our web scraping tools, with a focus on performance and accuracy
  • Test the acquired data to ensure accuracy and quality, and fix any breaks or performance issues as needed
  • Work with the Senior Manager of Content to identify best practices for using web scraping tools to collect relevant data
  • Identify and recommend internal process improvements
  • As a secondary responsibility, create Boolean searches to run in Solr by researching subject areas to identify the relevant and significant terms to include in each search
  • Follow the process for entering complete data into the Manzama Django interface


Required Experience and Skills

  • 2+ years of web crawling/scraping experience
  • Expertise with web scraping tools, including Mozenda and Diffbot
  • Proficiency using XPath
  • Proficiency using regular expressions (regex)
  • Experience in complex crawling (such as CAPTCHA handling, mobile OTP-based crawling, and bypassing proxies)
  • Experience conducting large-scale web crawling and scraping
  • Experience in various data extraction methods (such as data extraction from PDF files, webpages, etc.)
  • Sound knowledge of bot management techniques
  • Experience working with data APIs and data feeds
  • High attention to detail and passion for quality of work
  • Understanding of Ajax and jQuery is a plus
  • Experience using Solr
  • Interest in data science


Job type
Permanent
Information Technology
Work location
Software Development

Contact consultant

Consultant Borbely Livia, based in Budapest, is our expert handling this job opportunity.
Budapest, Szabadság tér 7, Bank Center

Phone: +36705186607