Ramin Naimi

Ramin's blog (ramin at ramin dot net)

Sunday, January 25, 2009

Elements of a monitoring system



A good monitoring system is hard to find. There are plenty of tools, scripts and applications that solve a narrow use case. For example, someone wants a way to poll a service to see when it is not available, so they write a script to do that. There are hardly any systems that provide a good end-to-end solution for monitoring. This is the primary reason why we chose to develop our own monitoring system inside Yahoo. It was the only way we could provide a solution that was flexible enough to fit the diverse usage in Yahoo while being designed to leverage the way Yahoo service engineers operate. Scalability is a big factor for Yahoo, and none of the existing solutions addressed scalability to the extent that Yahoo required.

What are the various elements of a complete solution for monitoring?
They are (in no particular order):


* Data Collection

* Status Tracking

* Alert Generation

* Storage

* Configuration Management

* User Interface



You can argue the list is too short (or too long), but the purpose of the list is to capture the main areas (the main elements). Each area can be broken down into sub-areas. I'll cover each area in a little more depth to provide some clarity. Keep in mind that this topic has so much detail that a whole book could be devoted to it!

Tuesday, January 13, 2009

Monitoring System Components - Data Collection


This area covers the different ways that the system collects the metrics, status and other information about what it's monitoring. There are 3 basic types of data the monitoring system cares about:
(a) Metrics
(b) Status
(c) Configuration

I cover Data for Configuration in the Configuration Management section.
I cover Data for Status in the Status Tracking section.
Data for metrics can reach the monitoring system in two ways: either the monitoring system polls (pulls from) the system or component being monitored, or the target system pushes the data asynchronously into the monitoring system. A good monitoring system has the flexibility to provide both options, since each fits a particular set of system characteristics and requirements.


In the above diagram, the target system sends metrics to the monitoring system asynchronously (1). The monitoring system doesn't initiate the request for metrics. This is typically done either by leveraging a scheduler in the target system or by some other trigger (e.g. metrics generated by activity on the target system). When the monitoring system initiates the metrics request (2), the scheduling of that request is performed on the monitoring system, and the request is sent to the target system. After the request is sent, the monitoring system (typically) waits for the response. This ensures that the metrics returned in the payload are associated with that particular request (and, more importantly, with the timestamp of the request). On rare occasions, the monitoring system can send the request and accept the result asynchronously via method (1). This option is typically difficult to accommodate, since even a slight clock difference between the monitoring server and the target system can throw off the logic that associates the metrics with the real timestamp.

There are pros and cons of each option.
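
To make the two collection paths concrete, here is a minimal sketch in Python. It is not how the Yahoo system worked; `Collector`, `Sample`, `receive_push` and `poll` are names made up for illustration, and samples are kept in memory only. The point it shows is where the timestamp comes from: a polled sample is stamped with the request time on the monitoring server's clock, while a pushed sample carries whatever time the target reports (or the arrival time if it reports none).

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Sample:
    target: str
    timestamp: float           # seconds since the epoch
    values: Dict[str, float]
    source: str                # "push" or "poll"


@dataclass
class Collector:
    """Hypothetical in-memory collector; a real system would persist samples."""
    clock: Callable[[], float] = time.time
    samples: List[Sample] = field(default_factory=list)

    def receive_push(self, target: str, values: Dict[str, float],
                     sent_at: Optional[float] = None) -> Sample:
        # (1) Push: the target decides when to send. If it supplies its own
        # timestamp we trust it, which is where clock differences creep in.
        timestamp = sent_at if sent_at is not None else self.clock()
        sample = Sample(target, timestamp, values, "push")
        self.samples.append(sample)
        return sample

    def poll(self, target: str, fetch: Callable[[], Dict[str, float]]) -> Sample:
        # (2) Poll: record the request time first, then wait for the response,
        # so the returned values are tied to the monitoring server's clock.
        requested_at = self.clock()
        values = fetch()
        sample = Sample(target, requested_at, values, "poll")
        self.samples.append(sample)
        return sample


if __name__ == "__main__":
    collector = Collector()
    collector.poll("web1", lambda: {"cpu_load": 0.42})
    collector.receive_push("web1", {"requests": 1200.0})
    for sample in collector.samples:
        print(sample.source, sample.values)
```

In this sketch `fetch` stands in for whatever request the monitoring system would actually send to the target.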

Sunday, January 11, 2009

Approaches to Monitoring Systems


Before I dive into the topic, let me give a brief overview. What's a monitoring system? Well, if you're developing or operating a complex online system, you want to know when things break (better yet, you want to know before things break that the system is about to break). A good monitoring system lets you understand the characteristics of your system by showing different types of data it has collected (e.g. how is your CPU performing over time, or is your server running out of disk space, or what's the average transaction rate of your system). With various views of your system, you can see where its bottlenecks and weak points are. It also serves as a tool to validate that your changes are addressing the real problem areas.

There are two approaches to monitoring a system:
1) Monitoring from outside-in
2) Monitoring from inside-out

Monitoring from outside-in

This approach is often taken by hosted monitoring services. Examples of these types of services are Gomez and Keynote. There are a large number of small players who offer similar synthetic checks of a URL.

Monitoring from inside-out


Monitoring your servers, systems, devices and services/applications extensively and thoroughly is the key to understanding how the organs of your system are behaving. This is a difficult and time-consuming exercise. It requires a lot of research on picking the right tool(s) to gather the data (if it exists). If the data does not exist, get ready to spend even more time instrumenting whatever component is missing the data. Picking the right tool to gather and visualize the data is no easy task. Why? Well, because monitoring systems process a lot of data, they have to be designed around the most common use cases. For example, consider these two scenarios:
- monitoring system user wants to capture a small number of metrics but at a very high frequency (e.g. 50 metrics that are generated once every second)
- monitoring system user wants to capture a large number of metrics but at a lower frequency (e.g. 3000 metrics that are generated once every minute)

In both of these examples, the monitoring system handles 45,000 metric values every 15 minutes (50 x 900 seconds, or 3000 x 15 minutes), but the approach taken to implement a solution for each can have a large impact on how efficiently it handles the volume of data.
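
The volume arithmetic can be checked with a small helper. This is only a sketch: `values_per_window` and its parameters are names invented here, and the once-a-minute interval used for the larger metric set is an assumption chosen so that both totals come out the same.

```python
def values_per_window(metric_count: int, interval_seconds: int,
                      window_seconds: int) -> int:
    """Values that arrive in a window when every metric reports once per interval."""
    reports_per_metric = window_seconds // interval_seconds
    return metric_count * reports_per_metric


WINDOW_SECONDS = 15 * 60

# 50 metrics, each reported once every second.
high_frequency = values_per_window(50, 1, WINDOW_SECONDS)

# 3000 metrics, each reported once a minute (assumed interval).
low_frequency = values_per_window(3000, 60, WINDOW_SECONDS)

if __name__ == "__main__":
    print(high_frequency, low_frequency)
```

The totals match, but the first case is 900 small writes per metric while the second is 15 larger batches, which is why the two need different handling.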
I will concentrate the rest of the discussion on the design of a monitoring system that is used to monitor from inside-out.

Monday, December 22, 2008

Targeted advertising challenges


Since the origins of the WWW, the promise of "Targeted content", "Customized Information" and grand monetization of visits through "Targeted Advertising" has been driving innovation and technologies in this space. Yet almost 2 decades have passed since the Internet took off and was discovered by the masses, and the Targeted Advertising promise has still not been achieved. Sure, there are exceptions, but I often wonder why the industry has made such limited (or slow) headway in this space.

At eVoice (the first free, hosted home voicemail service at the time), targeted advertising was at the core of our business plan. The idea was that we could cover the high cost of
  • Running the service (including operating the hardware to take incoming calls, store messages, play messages, etc.)
  • Setting up call forwarding with the telcos
  • Administrative overhead of eVoice
  • etc.
by providing targeted advertising to people.
We never got to execute fully on this vision because the cost of running the service and the telco interfaces was overwhelming; we ran out of funding and got acquired by AOL.

I realized that there was no reliable system to "target advertisements". I used the excuse that the advertising systems were still in their infancy and had not completed their offerings. Another thing I realized was that advertising for any media other than the web (i.e. display advertising) was virtually non-existent.
What did this mean to us at eVoice? Well, it meant we had to create our own advertisements. Kinda clunky, but we had to have something to show it was possible.

Well, now it's almost 2009 and the advertising market has changed a bit (but not much). The main player in search advertising is Google and the main player in display advertising is Yahoo... I know, I know... you can argue that the leader is someone else, depending on which publication you read.

So, what is the reason for this slow progress?

I believe there are several reasons. In order to understand it better, we have to look at the big picture of the Advertising Eco System:



Let's examine this (not complete) picture. In this scenario I have an end user going to a content provider to consume the content it provides (e.g. read their daily news from CNN.com).

  1. End user visits the content provider and asks for a web page
  2. Content provider (typically) knows what it's putting in the page. In the case of social network sites, or sites where the users are in charge of the content (e.g. blogs), the content provider doesn't (or can't) digest the content users provided.
  3. Advertisers run a campaign that typically is comprised of several ways to reach their audience (i.e. end users). They have a budget for a campaign and, depending on the purpose of their campaign, they spend portions of the money on display, click through, search results or other avenues. The advertisers (at least the ones with large budgets) work with one or more ad agencies. The advertisers provide information to the ad agencies that helps the agencies manage their campaigns effectively. This information could be the budget amount and how they'd like to spend it (e.g. Display Advertising, Click Through, SRP - search results page, etc).
  4. The ad agencies work with the Creative Directors to create the type of advertising creative that fits the advertisers' budget and desire. They then work with one or more ad exchange networks to make their creatives and campaign dollars available for consumption. The information they provide to the advertising network includes the creative, and ... the targeting profiles. For example, the targeting profile would indicate that they want the creatives shown in display ad placements that fall into the category of "Society and Culture" and subcategory of "Food and Drink". The agencies specify these categories based on how wide a net they want to cast to capture audiences.
  5. Ad exchange networks play the role of matching content to categories, and ultimately to advertising creatives. They take as input: (a) from content providers, either the content categories or whatever generic category the ad network assigned to the content provider, and (b) from advertising agencies, the targeted categories for which given ads should be served. The ad exchange network also keeps track of impressions and clicks for reporting and accounting purposes.
  6. The end user gets the "targeted ad" in the page. If all goes well, the user will click on the ad if the ad is relevant enough and the user is willing (or searching) to explore more.
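
The matching in step 5 can be sketched in a few lines of Python. This is a deliberate simplification and not any real ad network's logic; `Target`, `Creative`, `eligible_creatives` and the creative names are invented for illustration. A page is tagged with a category and optionally a sub-category, and a creative is eligible when its targeting profile covers that tag.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Target:
    category: str
    subcategory: Optional[str] = None   # None means "the whole category"


@dataclass(frozen=True)
class Creative:
    name: str
    target: Target


def eligible_creatives(page: Target, creatives: List[Creative]) -> List[Creative]:
    """Return the creatives whose targeting profile covers the page's tag."""
    matches = []
    for creative in creatives:
        profile = creative.target
        if profile.category != page.category:
            continue
        # A category-wide profile matches any page in the category; a
        # sub-category profile needs the page tagged with that sub-category.
        if profile.subcategory is None or profile.subcategory == page.subcategory:
            matches.append(creative)
    return matches


if __name__ == "__main__":
    creatives = [
        Creative("wine-banner", Target("Society and Culture", "Food and Drink")),
        Creative("generic-banner", Target("Society and Culture")),
    ]
    tagged_page = Target("Society and Culture", "Food and Drink")
    top_level_page = Target("Society and Culture")
    print([c.name for c in eligible_creatives(tagged_page, creatives)])
    print([c.name for c in eligible_creatives(top_level_page, creatives)])
```

Note that a page tagged only at the top level never receives the sub-category creative in this sketch, however relevant its content may be.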

As you can see, it's a complex workflow. There are too many middlemen who, through their involvement, lose the context of the advertiser-to-end-user connection.

There are several problems:
  • Advertisers rely on ad agencies to do the right mapping of the advertising to categories and ultimately end users. Ad agencies work with many ad networks and they have many advertisers as customers.
  • Advertisers have limited budgets, so the ad agencies have to determine whether they should concentrate the advertisers' money on a smaller number of categories and sub-categories, or target higher-level categories and not bother with sub-categories. In other words, should their campaign have a razor-sharp target profile, or a wider target profile?
  • Ad exchange networks have their own categories and sub-categories, and these differ from one ad exchange network to the next. This means ad agencies have to do more work to target ads more precisely (something agencies don't necessarily want to do).
  • Content providers don't necessarily have the capability or desire to understand the category/sub-category structure of the ad network they're integrating with, especially since they need to do more work to switch ad network providers, or to integrate with several ad network providers. In most cases, they just pick a top-level category and stick with it.
The combination of the above has the following side effect: the top categories end up with a lot more available content than the sub-categories, even though the advertisers (almost always) want to target at the sub-category level.

So, what's the solution? Here are my thoughts:
  • The eco system has to change so there's more transparency among the ad networks, the agencies, the content providers and the advertisers.
  • There has to be more available content that can be categorized at the sub-category level of the Ad Exchange Network.
  • The categories have to be owned and defined by a standards body (IAB?). This reduces friction for the ad agencies and allows them to create the mapping of creatives to categories just once.
I'm interested in hearing what others in the industry think about this topic.

Ramin

Wednesday, December 17, 2008

Sending SMS Messages from your web server



Sending SMS messages from your web server seems like something that’s relatively simple. It is, but what I found was that there are a lot of services that offer this, but for a fee. The fee structure is usually per message. Just do a Yahoo search on “SMS Gateway” and you’ll come across a ton of these.

Going with a paid service might be a good option for you, but there are other options. Almost all carriers provide an Email-to-SMS gateway, and they’re free. Therefore, the other option is to programmatically determine which email address domain to use based on the carrier for the phone number you’re interested in sending the SMS to.

There might be some restrictions on how frequently you can send messages, but I haven’t hit any limitations yet.

Here’s a list of carriers and the email addresses your web server can send an email to in order to send an SMS message to their customers:

http://en.wikipedia.org/wiki/SMS_gateways#Email_to_SMS_.2F_Web_to_SMS
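
As a rough sketch of the approach in Python: look up the gateway domain for the carrier, build an email addressed to the phone number at that domain, and hand it to your mail server. The function names, the sender address and the `localhost` SMTP host are assumptions made for this example, and the three gateway domains in the table are illustrative entries that should be checked against the list above before you rely on them.

```python
import smtplib
from email.message import EmailMessage

# Illustrative entries only; verify each domain against the carrier list.
GATEWAY_DOMAINS = {
    "verizon": "vtext.com",
    "att": "txt.att.net",
    "tmobile": "tmomail.net",
}


def build_sms_email(phone_number: str, carrier: str, text: str,
                    sender: str) -> EmailMessage:
    """Build the email that a carrier gateway would turn into an SMS."""
    # Keep only the digits, so "(555) 010-0123" becomes "5550100123".
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    domain = GATEWAY_DOMAINS[carrier]   # raises KeyError for an unknown carrier
    message = EmailMessage()
    message["From"] = sender
    message["To"] = f"{digits}@{domain}"
    message.set_content(text)
    return message


def send_sms(message: EmailMessage, smtp_host: str = "localhost") -> None:
    """Send the message through an SMTP server (assumed to accept local mail)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(message)


if __name__ == "__main__":
    sms = build_sms_email("(555) 010-0123", "verizon",
                          "Disk almost full on web1", "alerts@example.com")
    print(sms["To"])
    # send_sms(sms)  # uncomment once an SMTP server is available
```

The carrier still has to come from somewhere, for example by asking the user when they register their phone number.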

About Me

Ramin Naimi
I have over 18 years of experience in various high-tech industries. Most recently I was a leader in Yahoo’s platform engineering group, where I led the teams that developed and operated Yahoo’s internal monitoring and operational metrics collection systems. I have a wide range of experience, from client-side development to distributed servers.