Nov 3rd, 2011

Performance Monitoring with Tracelytics

We’ve had great success at SeatGeek moving more and more of our software into independent services. Clear service boundaries have allowed us to improve our code quality, increase programmer productivity/happiness, open source a few things, and in some cases, drastically improve performance.

The flip-side is that with 3 or 4 different languages connecting to 3 or 4 different data stores, with the network between them, and all writing to different log files, it has become a lot more difficult to reason about performance. When you can render a page with a single (albeit complex) SQL query, it’s easy to know where to look when you want to improve response times. If rendering that same page means a simpler SQL query, plus an HTTP call to an external service which may in turn communicate with Redis, it’s a little bit tougher to figure out where you’re spending most of your time.

StatsD and Graphite

Our first solution to this problem was a combination of StatsD and Graphite. Etsy has an interesting post about their usage of this combo. In short, StatsD + Graphite gives you a very simple way to time and count things, and then graph the results in realtime. Here’s an example chart of our tickets feed load times over the past 2 months or so. You can see the result of some performance improvements we made in mid September.

This is a pretty powerful setup. We’re measuring a ton of stuff using StatsD and Graphite, and it’s a great tool for letting you know where to start looking when investigating performance issues.
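To make the "time and count things" idea concrete, here is a minimal Python sketch of what a StatsD client sends over the wire. StatsD accepts plain-text lines of the form `name:value|type` over UDP, with `ms` for timers and `c` for counters. The metric names, host, and port below are illustrative assumptions, not our production configuration.

```python
import socket


def format_metric(name, value, metric_type):
    """Build a StatsD line such as 'feeds.tickets.load_time:320|ms'."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send (host and port are assumed defaults)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(line.encode("ascii"), (host, port))
    finally:
        sock.close()


# Hypothetical metric names, for illustration only.
timing_line = format_metric("feeds.tickets.load_time", 320, "ms")  # a timer
counter_line = format_metric("feeds.tickets.requests", 1, "c")     # a counter
```

Calling `send_metric(timing_line)` after each request is enough for StatsD to aggregate the timings and forward them to Graphite for charting.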

Tracelytics

Recently we were lucky enough to take part in the beta of a new product called Tracelytics. While StatsD + Graphite can help you figure out where to look, Tracelytics comes right out and tells you exactly what’s wrong. Tracelytics is organized around the concept of a “trace”, which is a detailed snapshot of a specific request. A trace contains details about every layer of software involved in handling a request, even when those layers are separated by the network.

I can’t open Tracelytics without stumbling across a glaring performance issue, and I mean that literally. I just popped Tracelytics open to grab some screenshots for this post and realized that we were requesting the same object from S3 multiple times in a single request. Not only did we have some bad logic which was grabbing the object over and over again, but because of incorrectly configured permissions, our per-server file cache for S3 objects wasn’t writable and so wasn’t working at all. Check it out:

s3 caching issue

What you’re seeing here is the details view for a single trace. The timeline at the top shows timing information for various request layers. The bottom left is showing details about the currently selected layer (the darkest blue rectangle on the timeline), which happens to be an HTTP request to s3. I clicked on each of the other blue rectangles, and saw that they too were requests to s3. Not good. Now as if that weren’t enough information to get started on a fix, I can scroll down and get an exact stack trace for each call:

tracelytics backtrace

Remember the “performance improvements” from mid September that were illustrated in the Graphite screenshot above? Well, that problem was actually uncovered and diagnosed with Tracelytics. I apologize for the crappy screenshot, but I took it just to throw in an email back in September (you can’t view historical data past a week in the Tracelytics interface yet; the data is retained, just not exposed). Here is a heatmap view of the performance of a specific SQL query:

tracelytics sql

This is a very simple query pulling tickets out of a single table. We had indexes in the right places, but the query pulls a lot of data and the table has grown a lot, and probably wasn’t residing completely in memory anymore. We ended up upgrading our DB server to the next instance size and tuning some MySQL config parameters, and were able to knock the average time for that query down from ~500ms to ~5ms.

Access to good data is integral to everything we do at SeatGeek, and Tracelytics gives us a ton of it, all in a very digestible way.

Nov 2nd, 2011

The Minutiae of Web Interfaces: Realism

We recently embarked on a complete redesign of the core page on our site, our ticket listings interface. After weeks of iterating with the other guys on the SeatGeek dev team, I marched off to an interview with a reporter from a major tech publication to show off the new UI for an upcoming story. His reaction: “Well, that doesn’t really look very different.” Oof.

But he was right. The two versions looked quite similar:

Before:

After:

Yet, even though it looks damn similar, I would argue this was a big step forward for us. In isolation, the elements that distinguish outstanding UIs from good UIs often go unnoticed by users.

A frequent 1am scenario in my apartment: I’ve just spent 15 mins fooling around in Photoshop, trying to improve the design of a button. I turn to my roommate and ask her which version she prefers. She says she can’t even tell the difference! Have I wasted my time? No. Holistically, this shit matters. A user can feel the difference.

We had four goals for this redesign: increasing the realism, usable real estate, simplicity, and unity among elements. But rather than discussing the rationale behind those, I’d like to cover some of the tiny, specific changes we made to accomplish them. The minutiae are important.

Here I’ll tackle how we attempted to improve the realism of the UI. I’ll save the rest for future posts.

Enhancing realism

Great UIs are real, tactile things that make you forget your computer screen is a screen at all. They evoke real-world sets of objects, like a panel of elevator buttons sitting in front of you. We made a number of small changes to enhance realism…

Texture

Few surfaces in the real world are truly monochromatic. The background of our old map (see above) was completely white; it felt artificial. We added a striped pattern that made it feel more organic. And few surfaces in the real world are devoid of imperfections, so we added noise to the pattern. Check out the random variations in pixel color in the closeup below:

**Pattern at normal zoom:**

**Pattern closeup:**

When asked “Do you notice any noise or imperfections in that pattern?” most users say no (I know, I’ve polled a bunch). But it subconsciously improves the verisimilitude of the UI. We also added noise to the ticket quantity selector background. Again, it’s difficult to consciously detect unless you zoom in:

**Quantity selector:**

**Background closeup:**

Depth

The perception of depth is critical for creating a UI that feels like a real-world control panel rather than a computer screen. The real world isn’t 2D. Our old map felt flat, whereas the new version has more dimension.

**Old map** (flat):

**New map** (depth!):

We did a few things to give the illusion of depth. First, we added a drop shadow around the edge of the map. The shadow is positioned as if there is a 90° light source, meaning light is shining down on the interface from directly above. This has become the standard across mobile and web interfaces. The shadow is bigger and darker than most shadows we use, but still sufficiently subtle that most users won’t look at the interface and think “Oh, there’s a shadow.”

We also added a stronger gradient to the markers on the map:

**Old marker:**

**New marker:**

The stronger gradient evokes a marker that is spherically shaped rather than flat. If the marker were a flat disc, then light hitting the disc from above would look the same everywhere on the disc. But if the marker were protruding off of the map, light from above would strike more of the upper half of the marker than the lower half. The stronger gradient conveys this.

We worked to give every element on the page a z-index relationship to the elements surrounding it. For text, that means deciding if it is sitting on top of its background element or if it will be embossed into that element. We tried to always select one or the other and avoid text that casts no shadow. As an example, check out the label in the upper-left part of the ticket listing element:

**Old version:**

**New version:**

The text in the old version had no shadow. In the new version, we used a white drop shadow on the bottom to convey that the text is embossed into the background. It gives the perception that light shining from above is passing over much of the label–since it’s sunken into the background–but that the light catches it at the bottom.

Active/hover states

In the real world, an object tends to change its appearance when you interact with it. As you move your hand over a button in an elevator, the way light strikes that button will change. And when you press that button, the light will change yet again. Thus, to create a UI that feels real, we tried to make it consistently but subtly responsive to user actions. As a bonus, this gives users consistent feedback about when their actions are being recognized by the app.

Perhaps the most obvious way to enhance responsiveness is with hover and active states. We’re trying to give every clickable element in the SeatGeek app its own hover and active state (we aren’t quite there yet). As an example, consider the email alert button on the top bar of the ticket listings UI (refer to the screenshots above for a refresher):

The old button is on the left; the new button is on the right. You’ll see that while the old button had a hover state (underlining the text), it didn’t have a unique active state, so we created that for the new version. In the new version, the button color and shading change in all three states. We’re also trying to get away from using text underlining for hover states. Buttons in the real world don’t underline themselves when you put your hand over them. An underline hover state is often unavoidable for links in body copy, but we got rid of it for buttons, like in the example above.

Animation

Things in the real world do not instantly appear or disappear from sight; they move in and out of view with a certain velocity and acceleration. Thus, we added animations to the interface whenever it seemed suitable. We actively tried to avoid gaudy animations reminiscent of ’90s PowerPoint presentations; we wanted each animation to be on the brink of noticeable. As an example, when a user signs up for an email alert, the modal window fades and zooms in very quickly. This screenshot catches it mid-animation:

There has been much ado recently about how Steve Jobs cared about the aesthetics of computer parts users would never see. I’m not sure how I feel about that. (Where did it end? Did he care about the beauty of chip internals?!). In any case, I’m talking about something very different here–changes that will be visible to users, but only when part of a cohesive whole. I think those details are the key to outstanding interfaces.

Sep 19th, 2011

Hiring Challenges Shouldn’t Be Limited to Developers

Last week we began recruiting for a new Director of Communications. The person we hire must be an outstanding writer, thus we’ve already spent hours combing through writing samples from applicants. Candidates have submitted a surprisingly diverse range of samples; we’ve received academic papers, articles in student newspapers, email pitches, and press releases. How can we compare these writing samples against one another, apples-to-apples? Moreover, which of these are accurate proxies for the type of content our Director of Communications will be producing?

We’re interested in stretching the bounds of the traditional hiring processes. For example, in order to apply to be a web developer at SeatGeek, an applicant must “hack” into our backend to drop their resume. As a result, we don’t get distracted by unqualified candidates and can thus spend more time on the strongest coders.

Introducing WorkAtSeatGeek.com

This morning we realized that our screening process for the Director of Communications job was broken. We brainstormed what we hope is a better solution: WorkAtSeatGeek.com. Before describing what that is, here’s a description of the role. Our PR strategy uses the mountain of ticketing data we’ve collected over the past few years. Whenever a big story in sports or music breaks, we try to quantify fan sentiment through ticket prices. Reporters love this data; we get 3-4 press mentions per week. Examples of how we utilize our data are available on our blog and press page.

If this is a role that interests you, here’s how to apply using WorkAtSeatGeek.com:

Email write@seatgeek.com with your resume attached.
The email address will auto-respond with instructions on how to access a ticketing dataset.
Once you receive the data, use it to write a blog post of up to 300 words with a graph or chart. It’s unlikely that all the data points in the dataset will be relevant to the story you choose.
Post your article on workatseatgeek.com. You can easily create an account by going to http://workatseatgeek.com/wp-login.php?action=register
For obvious reasons, we will only display the handle you choose and never your full name or email address.
Readers will vote on the most compelling posts by sharing them on Google Plus, Twitter, and Facebook. We encourage applicants to accumulate these social shares by actively promoting their pieces. In fact, the strongest applicants will probably be able to get legitimate press coverage.

Just as we focus on engineers that solve our developer challenge, we’ll focus our interviews on the handful of writers with the best-written and most-shared articles. Best of luck!

Sep 8th, 2011

What the SeatGeek R&B Star Devs Listen to While Programming

programming music at turntablefm

Devs are gonna dev. Coders gonna code. But they aren’t going to write blog posts, which is why I am here…except here, here, here, and here. Aight so they do some content work, but I can’t complain at all because they MAKE SeatGeek what it is. I just monkey market.

Programmers tend to prefer house, techno, electro, dub, and metal, but let’s take a look at what music our actual developers are listening to on a day-to-day basis. In addition, I have aggregated some of the top threads from around the web at Reddit, HackerNews and other programmer hot spots and provided links to those at the bottom.

Best Music to Code to - SeatGeek Edition

Adam Cohen - of Bitcoin discussion fame and other musings

Who: Jamiroquai
Genre: Funk/acid jazz
Why: Driving beats, fast tempo, keeps you awake. All songs kind of sound the same. Want a soundtrack that blends together.
Best Song to Program to:

Michael D’Auria of homemade brew fame

Who: all hip hop
Genre: Rap/Hip-Hop
Why: 1. High energy 2. Bobbing heads = mad lines of code 3. 5 Milkshakes
Best Song to Program to (explicit):

Jose Diaz-Gonzalez, of Cake PHP fame

Who: The Mars Volta
Genre: Psychedelic rock/free jazz (not sure what that means)
Why: Many, many, many loud and obnoxious drums that I use to set the pace. Pisses off everyone around me, thereby making me a happier person as I suck the fun out of the air. Also, lots of variety, so I can work to various tempos and beats.
Best Song to Program to:

And…

Eric Waller, The Fame

Who: fratmusic.com
Genre: various
Why: You’re not gonna not
Best Song to Program to:

Best Music to Program To - Answers Around the Web

HackerNews

What Music do you listen to while programming
Best music to code to: Reddit thread

Other

BONUS: This is what we partied to at the last SeatGeek party

Which you can find on Grooveshark here.

What do you listen to while programming? Does our dev team have it all wrong? Let us know on Twitter.

Aug 23rd, 2011

What an Earthquake Does to Page Response Times

You might have heard – there was an earthquake in Virginia which was felt in New York City. Twitter is exploding with east-coasters experiencing their first earthquake.

Over here at SeatGeek, we were excitedly discussing the tremor when Mike, our trusty sysadmin, realized that our Amazon AWS servers were all in Virginia, right near the epicenter. Did it impact the service at all?

It turns out, it did. For about six months, we’ve been using a combination of StatsD, Graphite, and GeckoBoard to power a real-time dashboard of some of our system stats. We walked to the front of the office to take a look, and sure enough, we saw a pretty nasty looking page response spike.

earthquakes make web servers sad

Lessons Learned

Earthquakes make Web Servers sad
Real time system monitoring is awesome

Jul 21st, 2011

How SeatGeek Measures PR Coverage

SeatGeek is a data-obsessed company and there’s no set of numbers more fun to track than company metrics. In every corner of the SeatGeek office hang television screens or posters where trends on web traffic, server response time, and revenue are prominently displayed. For a while, PR was one of the rare aspects of our business that eluded our quantitative efforts. The obvious measurements like PR traffic and the count of PR mentions seemed in isolation to do a poor job conveying the success of our efforts. We recently devised a better framework for measuring the value of each press hit. Our solution was to decompose each PR mention into a series of objective criteria and create a formula to score each hit based on what we deem most important. This allows us to set ambitious, measurable goals of what we want to achieve in PR and track progress on a weekly basis.

Before addressing how we quantify PR, it is worth spending a few moments discussing SeatGeek’s PR strategy. Our PR coverage tends to fall into two buckets: feature coverage and data mentions. The former doesn’t require much explanation; these are articles in business or tech press about SeatGeek like this piece in Entrepreneur Magazine or in BusinessWeek. Feature coverage tends to be lumpy. New investors, partnerships, and features are noteworthy events, but these occur irregularly. While feature coverage is the best type of PR, we need to supplement it with more frequent data coverage.

SeatGeek sits on a gold mine of sports and music ticketing data, and we use this data to shed unique insight into fan sentiment. For example, when Derek Jeter closed in on his 3000th hit, we noticed that ticket prices spiked on the secondary ticket market. Eager fans shelled out $181 on average for the Thursday night game against the Rays, 224% higher than face and 258% greater than the average for the season. Ben Kessler, our Director of Communications, analyzed the data and shared it with reporters who included it when discussing the game. Every week there are stories in sports and music about an artist going on tour, a team riding a winning streak, or a player getting traded. We can measure fan reaction through ticket prices.

SeatGeek has a simple backend module where our team enters all press mentions. When we enter each mention, we mark whether the article made it to print, if we got a link, if someone on our team was quoted, and other metrics. Importantly, we avoid subjective criteria. It’s tempting to have a scale to rank the “prestige of publisher that featured SeatGeek” but two people could arrive at very different values. Instead, to measure something like the legitimacy of the publisher, we’d use the PageRank of the domain’s homepage.

Our current formula to score each article is:

PageRank + 5*isFeatureArticle + isLinked*1.2^PageRank +
2*isTelevised + 0.3*(isPrint + isQuoted) +
if(Referral traffic in 48 hours following article > 500,  traffic / 100) +
isFeatureArticle * isSportsWriter

SeatGeek’s PR formula is a direct representation of our business goals. We reward feature coverage and links from high page rank sites because direct and search engine traffic are our fastest-growing user acquisition channels. Startups not focused on SEO may not attach the same emphasis here so there certainly isn’t a one-size-fits-all approach to measuring PR.
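For readers who want to experiment with their own weights, the formula above can be sketched as a small Python function. The parameter names are illustrative, the boolean flags are treated as 0 or 1, and the traffic term is assumed to contribute nothing when referral traffic is 500 or below (the formula does not spell out that case).

```python
def pr_score(page_rank, is_feature_article, is_linked, is_televised,
             is_print, is_quoted, referral_traffic_48h, is_sports_writer):
    """Score one press mention; flags are 0/1, page_rank is the homepage PageRank."""
    # Assumption: the traffic bonus is 0 unless referrals exceed 500.
    traffic_bonus = referral_traffic_48h / 100 if referral_traffic_48h > 500 else 0
    return (
        page_rank
        + 5 * is_feature_article
        + is_linked * 1.2 ** page_rank
        + 2 * is_televised
        + 0.3 * (is_print + is_quoted)
        + traffic_bonus
        + is_feature_article * is_sports_writer
    )


# A hypothetical linked feature article in print on a PageRank 6 site
# that sent 800 visitors in the following 48 hours.
score = pr_score(page_rank=6, is_feature_article=1, is_linked=1, is_televised=0,
                 is_print=1, is_quoted=0, referral_traffic_48h=800,
                 is_sports_writer=0)
```

Because the link term grows exponentially with PageRank, a link from a high-PageRank site quickly outweighs the flat bonuses for print or quotes, which matches the SEO emphasis described above.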

Establishing a quantitative framework for measuring PR facilitates goal-setting. Every week we aspire to get around 40 PR points, which usually amounts to around 4 press hits. We expect to ramp up to 65 weekly PR points by the end of the year because getting PR becomes easier when you have established relationships with reporters. Having all our press scores in a database lets us easily visualize our data and our customer-facing press section is always up to date.

Jul 8th, 2011

FuzzyWuzzy: Fuzzy String Matching in Python

seatgeek open sourced seatgeek/fuzzywuzzy

Fuzzy String Matching in Python

We’ve made it our mission to pull in event tickets from every corner of the internet, showing you them all on the same screen so you can compare them and get to your game/concert/show as quickly as possible.

Of course, a big problem with most corners of the internet is labeling. One of our most consistently frustrating issues is trying to figure out whether two ticket listings are for the same real-life event (that is, without enlisting the help of our army of interns).

To pick an example completely at random, Cirque du Soleil has a show running in New York called “Zarkana”. When we scour the web to find tickets for sale, mostly those tickets are identified by a title, date, time, and venue. Here is a selection of some titles we’ve actually seen for this show:

Cirque du Soleil Zarkana New York
Cirque du Soleil-Zarkana
Cirque du Soleil: Zarkanna
Cirque Du Soleil - Zarkana Tickets 8/31/11 (New York)
Cirque Du Soleil - ZARKANA (Matinee) (New York)
Cirque du Soleil - New York

As far as the internet goes, this is not too bad. A normal human intern would have no trouble picking up that all of these listings are for the same show. And a normal human intern would have no trouble picking up that those listings are different from the ones below:

Cirque du Soleil Kooza New York
Cirque du Soleil: KA
Cirque du Soleil Zarkana Las Vegas

But as you might imagine, we have far too many events (over 60,000) to be able to just throw interns at the problem. So we want to do this programmatically, but we also want our programmatic results to pass the “intern” test, and make sense to normal users.

To achieve this, we’ve built up a library of “fuzzy” string matching routines to help us along. And good news! We’re open sourcing it. The library is called “Fuzzywuzzy”, the code is pure python, and it depends only on the (excellent) difflib python library. It is available on Github right now.

String Similarity

The simplest way to compare two strings is with a measurement of edit distance. For example, the following two strings are quite similar:

NEW YORK METS
NEW YORK MEATS

Looks like a harmless misspelling. Can we quantify it? Using python’s difflib, that’s pretty easy:

from difflib import SequenceMatcher
m = SequenceMatcher(None, "NEW YORK METS", "NEW YORK MEATS")
m.ratio() ⇒ 0.962962962963

So it looks like these two strings are about 96% the same. Pretty good! We use this pattern so frequently, we wrote a helper method to encapsulate it:

fuzz.ratio("NEW YORK METS", "NEW YORK MEATS") ⇒ 96
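The helper itself isn't reproduced in this post, so here is a minimal sketch of what such a wrapper might look like. It assumes the 0-100 score is the difflib ratio scaled by 100 and truncated to an integer, which is consistent with the numbers quoted in this post; the library's actual implementation may differ.

```python
from difflib import SequenceMatcher


def ratio(s1, s2):
    """Similarity of two strings as an integer from 0 to 100."""
    # Assumption: truncate rather than round (0.9629... becomes 96).
    return int(100 * SequenceMatcher(None, s1, s2).ratio())


mets_score = ratio("NEW YORK METS", "NEW YORK MEATS")
```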

Great, so we’re done! Not quite. It turns out that the standard “string closeness” measurement works fine for very short strings (such as a single word) and very long strings (such as a full book), but not so much for 3-10 word labels. The naive approach is far too sensitive to minor differences in word order, missing or extra words, and other such issues.

Partial String Similarity

Here’s a good illustration:

fuzz.ratio("YANKEES", "NEW YORK YANKEES") ⇒ 60
fuzz.ratio("NEW YORK METS", "NEW YORK YANKEES") ⇒ 75

This doesn’t pass the intern test. The first two strings are clearly referring to the same team, but the second two are clearly referring to different ones. Yet, the score of the “bad” match is higher than the “right” one.

Inconsistent substrings are a common problem for us. To get around it, we use a heuristic we call “best partial” when two strings are of noticeably different lengths (such as the case above). If the shorter string is length m, and the longer string is length n, we’re basically interested in the score of the best matching length-m substring.

In this case, we’d look at the following combinations

fuzz.ratio("YANKEES", "NEW YOR") ⇒ 14
fuzz.ratio("YANKEES", "EW YORK") ⇒ 28
fuzz.ratio("YANKEES", "W YORK ") ⇒ 28
fuzz.ratio("YANKEES", " YORK Y") ⇒ 28
...
fuzz.ratio("YANKEES", "YANKEES") ⇒ 100

and conclude that the last one is clearly the best. It turns out that “Yankees” and “New York Yankees” are a perfect partial match…the shorter string is a substring of the longer. We have a helper function for this too (and it’s far more efficient than the simplified algorithm I just laid out)

fuzz.partial_ratio("YANKEES", "NEW YORK YANKEES") ⇒ 100
fuzz.partial_ratio("NEW YORK METS", "NEW YORK YANKEES") ⇒ 69

That’s more like it.
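The simplified sliding-window version described above can be sketched in a few lines of Python. This is the naive illustration only, not the more efficient routine that ships in the library, and the `ratio` helper here is the same assumed truncating wrapper around difflib.

```python
from difflib import SequenceMatcher


def ratio(s1, s2):
    """Similarity of two strings as an integer from 0 to 100 (assumed truncation)."""
    return int(100 * SequenceMatcher(None, s1, s2).ratio())


def naive_partial_ratio(s1, s2):
    """Best score of the shorter string against every same-length window of the longer."""
    shorter, longer = sorted((s1, s2), key=len)
    m = len(shorter)
    windows = (longer[i:i + m] for i in range(len(longer) - m + 1))
    return max(ratio(shorter, window) for window in windows)


yankees_score = naive_partial_ratio("YANKEES", "NEW YORK YANKEES")
mets_score = naive_partial_ratio("NEW YORK METS", "NEW YORK YANKEES")
```

The cost is one full comparison per window, which is why a production version would want something smarter than this brute-force scan.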

Out Of Order

Substrings aren’t our only problem. We also have to deal with differences in string construction. Here is an extremely common pattern, where one seller constructs strings as “<HOME_TEAM> vs <AWAY_TEAM>” and another constructs strings as “<AWAY_TEAM> vs <HOME_TEAM>”

fuzz.ratio("New York Mets vs Atlanta Braves", "Atlanta Braves vs New York Mets") ⇒ 45
fuzz.partial_ratio("New York Mets vs Atlanta Braves", "Atlanta Braves vs New York Mets") ⇒ 45

Again, these low scores don’t pass the intern test. If these listings are for the same day, they’re certainly referring to the same baseball game. We need a way to control for string construction.

To solve this, we’ve developed two different heuristics: The “token_sort” approach and the “token_set” approach. I’ll explain both.

Token Sort

The token sort approach involves tokenizing the string in question, sorting the tokens alphabetically, and then joining them back into a string. For example:

"new york mets vs atlanta braves"   →  "atlanta braves mets new vs york"

We then compare the transformed strings with a simple ratio(). That nicely solves our ordering problem, as our helper function below indicates:

fuzz.token_sort_ratio("New York Mets vs Atlanta Braves", "Atlanta Braves vs New York Mets") ⇒ 100
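As a rough sketch of the idea (not the library's exact code), token sort can be written as follows. Lowercasing and splitting on whitespace are assumptions made for this illustration, and `ratio` is the assumed truncating wrapper around difflib.

```python
from difflib import SequenceMatcher


def ratio(s1, s2):
    """Similarity of two strings as an integer from 0 to 100 (assumed truncation)."""
    return int(100 * SequenceMatcher(None, s1, s2).ratio())


def sort_tokens(s):
    """Lowercase, split on whitespace, sort the tokens, and rejoin them."""
    return " ".join(sorted(s.lower().split()))


def token_sort_ratio(s1, s2):
    """Compare two strings after normalizing away token order."""
    return ratio(sort_tokens(s1), sort_tokens(s2))


game_score = token_sort_ratio("New York Mets vs Atlanta Braves",
                              "Atlanta Braves vs New York Mets")
```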

Token Set

The token set approach is similar, but a little bit more flexible. Here, we tokenize both strings, but instead of immediately sorting and comparing, we split the tokens into two groups: intersection and remainder. We use those sets to build up a comparison string.

Here is an illustrative example:

s1 = "mariners vs angels"
s2 = "los angeles angels of anaheim at seattle mariners"

Using the token sort method isn’t that helpful, because the second (longer) string has too many extra tokens that get interleaved with the sort. We’d end up comparing:

t1 = "angels mariners vs"
t2 = "anaheim angeles angels at los mariners of seattle"

Not very useful. Instead, the set method allows us to detect that “angels” and “mariners” are common to both strings, and separate those out (the set intersection). Now we construct and compare strings of the following form

t0 = [SORTED_INTERSECTION]
t1 = [SORTED_INTERSECTION] + [SORTED_REST_OF_STRING1]
t2 = [SORTED_INTERSECTION] + [SORTED_REST_OF_STRING2]

And then compare each pair.

The intuition here is that because the SORTED_INTERSECTION component is always exactly the same, the scores increase when (a) that makes up a larger percentage of the full string, and (b) the string remainders are more similar. In our example

t0 = "angels mariners"
t1 = "angels mariners vs"
t2 = "angels mariners anaheim angeles at los of seattle"
fuzz.ratio(t0, t1) ⇒ 90
fuzz.ratio(t0, t2) ⇒ 46
fuzz.ratio(t1, t2) ⇒ 50
fuzz.token_set_ratio("mariners vs angels", "los angeles angels of anaheim at seattle mariners") ⇒ 90

There are other ways to combine these values. For example, we could have taken an average, or a min. But in our experience, a “best match possible” approach seems to provide the best real life outcomes. And of course, using a set means that duplicate tokens get lost in the transformation.

fuzz.token_set_ratio("Sirhan, Sirhan", "Sirhan") ⇒ 100
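Putting the pieces together, a minimal sketch of the token set heuristic might look like this. The tokenizer (lowercase, keep runs of letters and digits) is an assumption chosen so that punctuation such as the comma in "Sirhan, Sirhan" is ignored; the library's own preprocessing may differ.

```python
import re
from difflib import SequenceMatcher


def ratio(s1, s2):
    """Similarity of two strings as an integer from 0 to 100 (assumed truncation)."""
    return int(100 * SequenceMatcher(None, s1, s2).ratio())


def tokenize(s):
    """Set of lowercase alphanumeric tokens (assumed preprocessing)."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def token_set_ratio(s1, s2):
    """Best pairwise score among t0, t1 and t2 as described above."""
    tokens1, tokens2 = tokenize(s1), tokenize(s2)
    t0 = " ".join(sorted(tokens1 & tokens2))
    # strip() drops the separator when either part is empty.
    t1 = (t0 + " " + " ".join(sorted(tokens1 - tokens2))).strip()
    t2 = (t0 + " " + " ".join(sorted(tokens2 - tokens1))).strip()
    return max(ratio(t0, t1), ratio(t0, t2), ratio(t1, t2))


game_score = token_set_ratio("mariners vs angels",
                             "los angeles angels of anaheim at seattle mariners")
duplicate_score = token_set_ratio("Sirhan, Sirhan", "Sirhan")
```

Taking the max is the "best match possible" choice discussed above; swapping in an average or a min only requires changing the last line of the function.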

Conclusion

So there you have it. One of the secrets of SeatGeek revealed. There are more tidbits in the library (available on Github), including convenience methods for matching values into a list of options. Happy hunting.

Feb 14th, 2011

Announcing Soulmate

seatgeek open sourced seatgeek/soulmate

Redis-backed service for fast autocompleting

Have you ever felt so close to someone that it seemed like the two of you were finishing each other’s sentences? Well, as a Valentine’s Day gift to the community, we at SeatGeek have distilled some of Cupid’s magic into a Redis-backed service for doing exactly that: Soulmate is a tool for building fast autocompleters.

Give it a try right now on SeatGeek.

Inspired by Auto Complete with Redis, Soulmate uses sorted sets to build an index of partially completed words and the corresponding top matching items, and provides a simple sinatra app to query them.
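To illustrate the indexing idea without a Redis server, here is a small in-memory Python sketch. A plain dictionary of sets stands in for the Redis sorted sets, and the item fields (`id`, `term`, `score`) and the minimum prefix length are assumptions for this example; this is not Soulmate's code, which is written in Ruby.

```python
from collections import defaultdict


def build_index(items, min_prefix_len=2):
    """Map every prefix of every word in an item's term to that item's id."""
    index = defaultdict(set)
    for item in items:
        for word in item["term"].lower().split():
            for end in range(min_prefix_len, len(word) + 1):
                index[word[:end]].add(item["id"])
    return index


def suggest(index, items_by_id, prefix, limit=5):
    """Return the terms matching a prefix, highest score first."""
    ids = index.get(prefix.lower(), set())
    ranked = sorted(ids, key=lambda i: items_by_id[i]["score"], reverse=True)
    return [items_by_id[i]["term"] for i in ranked[:limit]]


# Hypothetical items and scores, for illustration only.
items = [
    {"id": 1, "term": "New York Yankees", "score": 90},
    {"id": 2, "term": "New York Mets", "score": 85},
    {"id": 3, "term": "Yankee Stadium", "score": 50},
]
items_by_id = {item["id"]: item for item in items}
index = build_index(items)
yank_results = suggest(index, items_by_id, "yank")
```

The trade-off is the same as with the Redis version: the index is larger than the data because every prefix is stored, but each lookup is a single key fetch.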

Here’s a quick overview of what the initial version of Soulmate supports:

Provide suggestions for multiple types of items in a single query (at SeatGeek we’re autocompleting for performers, events, and venues)
Results are ordered by a user-specified score
Arbitrary metadata for each item (at SeatGeek we’re storing both a url and a subtitle)

Check out the GitHub repo for instructions on how to use it, or if you’re feeling saucy, just gem install soulmate.

Oct 25th, 2010

Henceforth, All Job Applicants Must Hack Into Our Backend

screenshot

The early stage of the hiring process has a huge signal-to-noise problem. A job posting on one of the standard career sites garners hundreds of resumes, but most are poor, and sorting cruft takes countless hours. Outstanding web developers do not generally spend their time trolling the job listings on Craigslist. They do, however, enjoy puzzles.

Therefore, we’re changing the application process for our web developer position. All applicants must now submit their resume by solving a puzzle: they must hack into our backend jobs admin panel.

Admittedly, this is contrived. We didn’t have a backend jobs admin till last week, when Eric and Mike made it for the purpose of this challenge. But it should be a fun challenge for any dev up to the task. Anyone who successfully submits their resume will be carefully considered. Even if you aren’t looking for a job, feel free to give it a spin and drop us a line at hi@seatgeek.com to let us know what you think.

Note: All errors you run into are intentional. A blank page should be considered an error. If you see a blank page, your resume has not been submitted.

Here’s the job description: https://seatgeek.com/jobs

Note: This challenge has been discontinued.

Oct 21st, 2010

Announcing DJJob - a PHP Port of Delayed_job

seatgeek open sourced seatgeek/djjob

Database backed asynchronous priority queue – A PHP port of delayed_job

DJJob is our database-backed job system that allows PHP web applications to process long-running tasks asynchronously. It is a nearly direct port of delayed_job, one of the most popular Ruby/Rails job processing systems.

A few months ago, I went searching for a PHP-friendly equivalent of the many popular queue/worker-based job systems available to Ruby apps. But the best I could find was a few pointers to pcntl_fork. There are a few language-agnostic systems, but in general they seem more complicated and often rely on additional moving parts.

Delayed_job’s design is attractive because most people are already running a SQL db, and storing jobs in the db instantly buys you a pretty full featured job management system in phpMyAdmin.
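The general shape of a database-backed queue can be sketched in a few lines. The example below uses Python and SQLite purely for illustration; the table layout, function names, and the "higher priority runs first" ordering are assumptions of this sketch, not DJJob's actual schema or API, and it omits the locking and retry handling a real multi-worker system needs.

```python
import json
import sqlite3


def create_queue(conn):
    """Create the jobs table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " priority INTEGER NOT NULL DEFAULT 0,"
        " handler TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )


def enqueue(conn, handler, payload, priority=0):
    """Store a job as a row; the payload is serialized as JSON."""
    conn.execute(
        "INSERT INTO jobs (priority, handler, payload) VALUES (?, ?, ?)",
        (priority, handler, json.dumps(payload)),
    )
    conn.commit()


def work_one(conn, handlers):
    """Run the next job (highest priority, then oldest); False if the queue is empty."""
    row = conn.execute(
        "SELECT id, handler, payload FROM jobs"
        " ORDER BY priority DESC, id ASC LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    job_id, handler, payload = row
    handlers[handler](**json.loads(payload))
    conn.execute("DELETE FROM jobs WHERE id = ?", (job_id,))
    conn.commit()
    return True
```

Because jobs are ordinary rows, inspecting or re-prioritizing the queue is just a matter of running SQL against the table, which is the appeal described above.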

The result is that we decided to port delayed_job to PHP and bring simple, robust job processing to PHP web apps.

Take a look at the github repo to see how to use it and to get the code.

