Interview #4: Inside Web Scraping – An Interview with Martin Ganchev, Smartproxy

On proxies, web scraping and pets

Welcome to our monthly interview. This time it’s the turn of Martin Ganchev, VP of Enterprise Partnerships at Smartproxy.


Hi Martin, thanks for joining us on The Web Scraping Club, I’m really happy to have you here.

First of all, tell us a bit about yourself and what brought you to Smartproxy.

My name is Martin and I’m the VP of Enterprise Partnerships at Smartproxy. I joined the company 1 year ago.

In my previous company, I worked on various software engineering projects. Some of them required vast amounts of data collection. We used several proxy providers, one of which was Smartproxy. The company made an impression with its customer support and the quality of service. Shortly after, I saw an opportunity to join Smartproxy, and I took it without hesitation. I wanted to be part of a fast-growing global tech company.

And what about Smartproxy: what does the company do, and what value does it bring to your customers?

Smartproxy started as a fully self-service-based and product-led provider. The main goal was to sell residential and datacenter proxies and target an entry-level user segment. Things began to move in a different direction when the company got inquiries from several Fortune 500 companies. That was a clear indication that there was a higher potential. Since 2020, Smartproxy has offered not only residential and datacenter proxies but also web scraping infrastructure that uses proxies as leverage. Currently, we have an eCommerce Scraping API (it can target Amazon, Wayfair, Aliexpress, eBay, etc.), SERP Scraping API, Web Scraping API (it scrapes data by URL), and a No-Code Scraper that doesn’t require any coding knowledge.

We source our IPs ethically and focus on the quality of our proxies, clean infrastructure, and round-the-clock, professional tech support. The main difference between other proxy providers and us is the quality of service we offer to our clients.

As a major player in the proxy industry, how do you see it from the inside? Is the industry growing, following the increasing need for web data? Do you feel there are looming threats from a regulatory/privacy perspective?

The industry is ever-growing, and we spot new proxy providers almost daily. However, we clearly understand that new players, the small ones, are not fully capable of maintaining large numbers of IPs, so they rely on big proxy companies whose pools they resell.

That’s why we’re always happy to collaborate with other proxy companies, sharing and exchanging knowledge and support with each other. We believe that there is room for everyone on the proxy market.

One of my big curiosities about the proxy industry: how does it work in the IPs market? Where do you find IPs to sell/buy? How can you say you’re buying a good one?

There are several ways to procure IPs. Smaller companies that are just entering the market typically rely on reselling the infrastructure of bigger, more established proxy providers. Those bigger providers typically have an in-house procurement team, which is responsible for supplying new IPs. In the case of Smartproxy, we acquire our exit nodes from various providers. Some of them are wholesalers who receive their exit nodes from application owners. It is important to note that we require these providers to ensure that end users are reasonably informed and have consented to such use of their devices.

Regarding the QA of the IPs, our supply and engineering teams constantly test the infrastructure. That includes in-house stability checks and IP quality tests (speed, location in various databases, measurement, and stress testing), after which our internal algorithm selects the highest-quality proxies. In addition, we are always open to hearing our clients’ feedback regarding proxy quality.
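To illustrate the kind of quality-based selection described above, here is a minimal Python sketch. It is not Smartproxy's internal algorithm, which the interview does not describe. The `ProxyStats` fields, the `score` formula, and the `select_best` helper are all assumptions made up for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ProxyStats:
    """Hypothetical record of repeated health checks for one exit node."""
    address: str              # illustrative label, not a real endpoint
    latencies_ms: list[float] # response times observed during checks
    successes: int            # checks that returned a valid response
    attempts: int             # total checks performed


def score(proxy: ProxyStats) -> float:
    """Assumed scoring rule: success rate discounted by average latency.

    Higher is better. A proxy with no measurements scores 0.0.
    """
    if proxy.attempts == 0 or not proxy.latencies_ms:
        return 0.0
    success_rate = proxy.successes / proxy.attempts
    avg_latency_s = mean(proxy.latencies_ms) / 1000
    return success_rate / (1 + avg_latency_s)


def select_best(pool: list[ProxyStats], k: int) -> list[ProxyStats]:
    """Return the k highest-scoring proxies from the pool."""
    return sorted(pool, key=score, reverse=True)[:k]
```

A real pipeline would also weigh the other signals mentioned in the answer, such as geolocation consistency across databases and stress-test results. This sketch only shows the general shape of "measure, score, select".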

What features should a customer look at when choosing a proxy provider? Diversification? Reputation?

When selecting a proxy provider, each client has their own expectations. Diversification is one of the main reasons why so many providers exist. We always advise clients that they don’t necessarily need to rely on one proxy provider. And in most cases, companies prefer to have multiple providers when the traffic is high. That’s the beauty of our industry – there is a place and market for each provider.

In terms of reputation – definitely, yes. It’s a crucial factor; however, low reputation and quality usually come with lower prices. Unfortunately, quality comes with a price. Focusing on the infrastructure requires attention and resources.

I believe that tech support and communication matter a lot. When a client has a question, they want it answered straight away, regardless of the time zone or day of the week. At Smartproxy, we have 24/7 live chat technical support, so there’s always a professional available to chat with.

Of course, there are other factors like geolocation coverage, speed, product variety, etc. But that’s specific to each client’s use case.
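The advice above about not relying on a single provider can be sketched in code. The following Python example is a hypothetical illustration, not something described in the interview: the `ProviderRotator` class and the provider names are assumptions. It rotates requests across several providers in round-robin order and skips any that have been marked as failing.

```python
from itertools import cycle


class ProviderRotator:
    """Round-robin over proxy providers, skipping ones marked as failed."""

    def __init__(self, providers: list[str]) -> None:
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = list(providers)
        self._cycle = cycle(self._providers)
        self._disabled: set[str] = set()

    def mark_failed(self, name: str) -> None:
        """Exclude a provider from rotation, e.g. after repeated errors."""
        self._disabled.add(name)

    def next_provider(self) -> str:
        """Return the next healthy provider in rotation order."""
        # One full pass over the list is enough to find a healthy provider.
        for _ in range(len(self._providers)):
            candidate = next(self._cycle)
            if candidate not in self._disabled:
                return candidate
        raise RuntimeError("no healthy providers left")
```

In practice the rotation would likely be weighted by price, geolocation coverage, or measured quality rather than being a plain round-robin.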

I’ve seen that your second business pillar is your API collection for web scraping. Can you describe more about the services?

With the growth of our customer base, we noticed that we’re serving a big part of the web scraping market. That allowed us to take the strategic step forward and focus our activities on developing ready-made scraping APIs by leveraging our residential and datacenter proxies.

Our product portfolio consists of several APIs. The SERP Scraping API is used for scraping Google and other search engines. eCommerce Scraping API is used primarily for scraping major marketplaces – Amazon, Wayfair, Aliexpress.

In addition, we have our Web Scraping API, which can scrape any website by URL. The fourth scraping solution we offer is our brand-new No-Code Scraper which allows clients to gather data without writing a single line of code. All scrapers include our residential proxies, offer a 100% success rate, and output in JSON or HTML (Web Scraping API supports only HTML).

There is yet another API cooking in Smartproxy’s kitchen at the moment – there will be more details about it soon. All I can say is that no other proxy provider has successfully managed to develop something like that (so far).

Liked the article? Subscribe for free to The Web Scraping Club to receive twice a week a new one in your inbox.

Are you planning to expand in the data-providing business in the future?

That’s a really good question. For now, our complete focus is on maintaining the stability and quality of our proxy pool and strengthening our position as a full-stack data collection infrastructure provider, helping various businesses to unlock web potential. Only time will show the next strategic step on our end.

In recent market research about the web scraping industry, I’ve seen that 75% of the expense in web scraping goes to internal projects inside companies rather than to buying externally scraped data. In your opinion, why does this happen? Are the companies selling datasets missing something? Or is web scraping only the first step of a long value chain that needs to integrate industry expertise before the data is ready to use?

That’s really interesting data, Pierluigi. I believe that companies that do it in-house prefer the convenience of having complete authority over the data extraction process. They can control the whole data cycle and, most importantly, keep track of its quality. It is also connected to the volumes of data needed and whether the external data company can support that. What we see with large e-commerce companies is that they need real-time data, so they definitely need to leverage their internal capabilities. On the other hand, for smaller companies that don’t have the resources to own a dedicated data scraping team, purchasing datasets makes total sense.

However, I’ve had cases with some of the biggest names in the retail industry who do both purchasing and internal scraping. But other large corporations are keen to develop an internal web scraping team for the convenience of managing the whole cycle of data without relying on external providers.

How do you see the web scraping tools industry in the future? 

We see more and more companies that are trying to develop their own scraping products. However, we also have clients who require a more sophisticated scraping solution that is customized to their case. Therefore, I believe that scraping tools will continue to pop up, and companies will take advantage of being able to scrape data without the need to maintain code on their end. On the other hand, when doing scraping work that requires customization, companies will continue to leverage their in-house teams.

Any fun facts about the early days you want to share?

On my first day at Smartproxy, I had a meeting with our team. It was an intro call, and the goal was to get to know everyone.

Mid-call, I heard someone screaming loudly, “SCRAPER, SCRAPER, SCRAPER.” I knew that people at Smartproxy were obsessed with proxies and scraping, but I didn’t realize it could be to that extent. It turned out that the guy shouting was one of our software engineers. His dog’s name is Scraper, and apparently, Scraper was being naughty, peeing on one of the flower pots in the flat.

After the “incident,” another team member showed his cat during the call. The cat’s name is Python.
