How PursueIt is imbibing data science into activity listings
At PursueIt, we're revolutionizing the way people nationwide discover and schedule activities. Our mission is to provide customers with the ideal activities that resonate with their unique preferences, while sparing them the effort of searching and keeping up with the latest trends. Our listings are thoughtfully curated from the latest market trends and complemented by our own distinctive activities, giving customers a highly topical and relevant selection.
Our business model enables unprecedented data science, not only in recommendation systems, but also in human computation, resource management, inventory management, algorithmic activity design, and many other areas. Experimentation and algorithm development are deeply ingrained in everything that PursueIt does. We’ll describe a few examples in detail as you scroll along.
So what does the data look like? In addition to the rich feedback data we get from our clients, we also have vast datasets on a multitude of other buying behaviours, such as filter engagement, key search terms, and demand elasticities. Our buyers and activity providers capture location and activity details, and our clients fill out a profile upon signup that’s calibrated to get us the most useful data with the least client effort.
Let's first walk through the filling of an activity schedule request to see a few of the many algorithms that play a role in that process, before zooming out to view the bigger picture.
As noted, when a client first signs up, they fill out an Activity Profile (it can be updated at any time).
At PursueIt, our mission is to revolutionize how people discover activities that resonate with their preferences. Our customers aspire to discover impeccable activities tailored to their unique clientele, all while avoiding the complexities of searching and staying up-to-date with the latest trends.
Our inventory is meticulously curated from various sources and enhanced by us, effectively bridging any gaps. This collection remains current, vast, and diverse, ensuring relevance.
The assignment of clients to activity providers is then a binary optimization problem.
Leveraging comprehensive data from both ends of this 'market,' PursueIt operates as a proficient matchmaker, linking clients with activities that align with their customers' desires—activities they might never have stumbled upon independently.
The activity request is then routed to the Humans + Machines activity algorithm.
Our operational framework facilitates unparalleled implementation of data science and comprehensive insight for our buyers. This encompasses not only recommendation systems but also human computation, resource allocation, inventory supervision, algorithmic activity design, financial strategizing, and activity aesthetics, among numerous other domains. The principles of experimentation, customer support, and algorithm advancement are deeply embedded in every facet of PursueIt's endeavors.
As you continue reading, we will delve into a few illustrative instances to provide a more detailed understanding.
However, unlike most collaborative filtering problems, we have a lot of explicit data, both from clients' self-descriptions and from activity attributes. This helps with the cold start problem and also allows for greater accuracy if we employ algorithms that consider this data.
One such approach is mixed-effects modeling, which is particularly useful because of the longitudinal nature of our problem: it lets us learn (and track) our clients' preferences over time, both individually and as a whole.
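To make the idea concrete, here is a minimal sketch of the simplest mixed-effects case: a random intercept per client, estimated by shrinking each client's average rating toward the overall average. The function name, the rating data, and the two variance values are assumptions made for illustration; a real model would estimate the variances from data and include many more fixed and random effects.

```python
from statistics import mean

def shrunken_client_effects(ratings_by_client, noise_var=1.0, client_var=0.5):
    """Estimate a per-client offset from the overall mean rating.

    Simplified random-intercept model: rating = overall mean + client effect + noise.
    noise_var and client_var are assumed known here (hypothetical values).
    Clients with few ratings are pulled more strongly toward zero.
    """
    all_ratings = [r for ratings in ratings_by_client.values() for r in ratings]
    grand_mean = mean(all_ratings)
    effects = {}
    for client, ratings in ratings_by_client.items():
        n = len(ratings)
        # Shrinkage weight approaches 1 as the client accumulates feedback.
        weight = n / (n + noise_var / client_var)
        effects[client] = weight * (mean(ratings) - grand_mean)
    return grand_mean, effects

# Hypothetical feedback: one long-standing client, one brand-new client.
grand_mean, effects = shrunken_client_effects({"client_a": [5, 5, 5, 5], "client_b": [1]})
```

Because the weight grows with the number of ratings, re-running the estimate as feedback arrives gives a simple way to track a client's preference over time while still borrowing strength from the population as a whole.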
And in addition to the many explicit features available, there are some particularly pertinent latent (unstated) features of both clients and styles that we can infer from other data (structured and/or unstructured) and use to improve our performance.
For example, a new client may tell us that they want medium-paced fitness activities, but where exactly would their preference fall along the spectrum from the gentler end of medium to the more vigorous end? The same question also applies to particular styles of activities in our inventory. Note that in this illustration we’re treating pace as unidimensional for simplicity, but in fact at PursueIt we treat it as multidimensional. With clients' feedback and activity histories, we can learn where particular clients and activities fall along this spectrum. These latent features can then be used in our mixed-effects models and elsewhere.
Moving our problem even further beyond classical collaborative filtering, we also have a lot of photographic and textual data to consider: activity style photos, Instagram posts, and the vast amount of written feedback and request notes we receive from clients.
Sometimes it can be difficult to describe your activity preferences in words, but you know it when you see it—so we have our machines look at posts of activities that customers like (e.g. from Instagram), and look for similar items in our inventory.
We use trained neural networks to derive vector descriptions of 'posted' images, and then compute a cosine similarity between these vectors and pre-computed vectors for each activity in our inventory.
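The similarity step can be sketched as follows. The vectors below are tiny made-up stand-ins for the neural network embeddings, and the activity names and the function `rank_inventory` are hypothetical; only the cosine similarity computation itself is taken from the description above.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_inventory(post_vector, inventory_vectors):
    """Return (activity, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(post_vector, vector))
        for name, vector in inventory_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical pre-computed vectors for three inventory activities.
inventory = {
    "yoga": [1.0, 0.0, 0.0],
    "hiit": [0.0, 1.0, 0.0],
    "pilates": [0.9, 0.1, 0.0],
}
ranking = rank_inventory([1.0, 0.0, 0.0], inventory)
```

Pre-computing the inventory vectors, as the text describes, means only the vector for the posted image has to be derived at request time.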
Natural language processing is used to score items based on the client's request note and textual feedback from other clients about the same item.
All of these algorithm scores—and many others like them—are taken into account when ordering and presenting options for the human expert activity provider to consider.
Once the machine-generated rankings are complete, the activity request is directed to a human analyst. Unlike machines, humans are highly heterogeneous. Machines are interchangeable: pick any one. Human analysts and their decisions, however, suit some clients better than others, so we use algorithms to optimize this pairing. We first calculate a match score between each available activity provider and every client who has requested an activity session within the current period. This match score is a function of the history between the client and the activity provider (if any) and of the affinities between the client's stated and underlying preferences and those of the activities.
Subsequently, the assignment optimization problem resembles the listing assignment challenge discussed earlier.
However, there are distinctions:
(a) it only needs to consider clients awaiting activities, and
(b) the optimization problem must be recalculated more frequently to accommodate the varying queue sizes of activity providers as they engage in their work.
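As a rough illustration of the pairing step, the sketch below picks the client-to-provider assignment with the highest total match score by brute force. The score matrix is invented, and the one-client-per-provider restriction is a simplifying assumption; the real problem has queue capacities and far more clients, so it would need a proper optimization solver rather than enumeration.

```python
from itertools import permutations

def best_assignment(match_scores):
    """Find the assignment of clients to distinct providers with the highest total score.

    match_scores[i][j] is the (hypothetical) match score of client i with provider j.
    Assumes at most as many clients as providers and one client per provider.
    Brute force: only suitable for very small instances.
    """
    n_clients = len(match_scores)
    n_providers = len(match_scores[0])
    best_total, best_choice = float("-inf"), None
    for choice in permutations(range(n_providers), n_clients):
        total = sum(match_scores[client][provider] for client, provider in enumerate(choice))
        if total > best_total:
            best_total, best_choice = total, choice
    return best_choice, best_total

# Three waiting clients (rows) and three available providers (columns).
scores = [
    [0.90, 0.80, 0.10],
    [0.85, 0.20, 0.30],
    [0.20, 0.70, 0.60],
]
assignment, total = best_assignment(scores)
```

In this example a greedy rule would give the first client their top-scoring provider (column 0), but the best overall assignment sends that client to column 1 so that the second client can have column 0. That trade-off is why the pairing is treated as an optimization problem rather than a per-client lookup.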
To begin listing an activity, an activity provider picks up a task in a custom-built interface designed to help them quickly and deeply understand the client.
This wraps up the activity listing procedure, and the activity listing is now ready to be processed.
But this is just the beginning.
The client takes the activity, is hopefully delighted, and then tells us what they think about each aspect of the activity listing. There is a symbiotic relationship between them and PursueIt, and they give us very insightful feedback that we use not only to better serve them next time, but also to better serve other clients.
To recap the process of filling a single activity request: a client creates a Requirement Profile and requests an activity, we match them to an activity provider, we deliver the activity, and they provide us with feedback.
But this is just one activity. Zooming out, we can consider the system as a whole. At this level, two other aspects of the business become clear:
(1) We must continually replenish our inventory by buying and/or designing new activity listings for our clients, which provides an excellent opportunity to benefit from our rich data;
(2) We must anticipate our clients' needs in order to make sure that we have enough of the right resources in place at the right times.
Let's first look at how we anticipate our clients' needs, then we'll swing around to consider inventory management and new activity development.
When addressing the challenge of anticipating client needs and related issues, we adopt a perspective that involves examining the distinct "state" of each client at different moments in time. This entails considering various factors, such as whether they are new clients, if they were referred or joined independently, whether their inventory is nearly complete, whether they are in a phase of expanding or seeking sales growth, or simply looking to explore new options. These different states influence aspects like outreach frequency, email communication preferences, and more.
We meticulously document every interaction we have with each client—every product we send, all feedback received, referrals made, emails exchanged, and more. Through the analysis of this comprehensive data, our goal is to comprehend clients' states and the corresponding needs that arise within those states. By recognizing shifts in state and identifying potential triggers, we can gain valuable insights that contribute to enhancing client satisfaction.
Once we've established a clear understanding of these states and their transitions, we move towards constructing state transition matrices and developing Markov chain models. These tools empower us to delve into the broader effects within our system. Through the examination of state transitions and probabilities, we can uncover system-level dynamics and trends that further inform our strategies.
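A minimal sketch of this step is shown below: transition probabilities are estimated by counting observed state changes, and the resulting matrix is used to project the mix of client states one period ahead. The state names and histories are hypothetical, and treating states with no observed outgoing transitions as absorbing is an assumption of this sketch.

```python
from collections import Counter, defaultdict

def estimate_transition_matrix(histories):
    """Estimate P(next state | current state) from sequences of observed states."""
    counts = defaultdict(Counter)
    for history in histories:
        for current, nxt in zip(history, history[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for state, next_counts in counts.items():
        total = sum(next_counts.values())
        matrix[state] = {nxt: c / total for nxt, c in next_counts.items()}
    return matrix

def step(distribution, matrix):
    """Advance a distribution over states by one period."""
    projected = defaultdict(float)
    for state, mass in distribution.items():
        row = matrix.get(state)
        if not row:
            # Assumption: a state never seen to transition is absorbing.
            projected[state] += mass
            continue
        for target, probability in row.items():
            projected[target] += mass * probability
    return dict(projected)

# Hypothetical per-period state histories for three clients.
histories = [
    ["new", "active", "active", "lapsed"],
    ["new", "active", "lapsed"],
    ["new", "lapsed"],
]
matrix = estimate_transition_matrix(histories)
after_one_period = step({"new": 1.0}, matrix)
after_two_periods = step(after_one_period, matrix)
```

Applying `step` repeatedly to the current mix of client states gives a projection of how many clients will be in each state in future periods, which is the kind of system-level view the chain models provide.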
Ultimately, our approach revolves around creating a comprehensive framework that allows us to monitor, analyze, and respond to clients' evolving needs by deciphering their changing states. By employing advanced analytical techniques, we aim to ensure that our clients receive a personalized and responsive experience that aligns with their specific clientele and requirements.
One of the many uses of these chain models is to anticipate future demand, which is important because our activity providers often need to arrange resources months before they are needed in their listings. We aim to restore balance in this critical step.
Inventory depletion through customer demand must ultimately be offset by purchases of new inventory. One of the challenges is in getting the timing of purchases right, so that we maintain adequate inventory availability for activities while minimizing the sum of ordering costs and carrying costs (the operation costs and opportunity costs of capital associated with the area under the inventory curve).
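The trade-off between ordering costs and carrying costs can be illustrated with the classical economic order quantity (EOQ) model. This is a textbook simplification, not a description of PursueIt's actual purchasing model: it assumes constant demand, a fixed cost per order, and a sawtooth inventory curve whose average level is half the order quantity. All numbers below are made up.

```python
import math

def economic_order_quantity(annual_demand, cost_per_order, holding_cost_per_unit):
    """Order size that minimizes ordering plus carrying cost under EOQ assumptions."""
    return math.sqrt(2 * annual_demand * cost_per_order / holding_cost_per_unit)

def annual_cost(order_quantity, annual_demand, cost_per_order, holding_cost_per_unit):
    """Total yearly cost: number of orders times order cost, plus average stock times holding cost."""
    ordering = (annual_demand / order_quantity) * cost_per_order
    carrying = (order_quantity / 2) * holding_cost_per_unit  # area under the sawtooth curve
    return ordering + carrying

# Hypothetical figures: 1200 units per year, 50 per order placed, 3 per unit per year to hold.
q_star = economic_order_quantity(1200, 50, 3)   # 200.0
cost_at_optimum = annual_cost(q_star, 1200, 50, 3)  # 600.0
```

Ordering less often in larger batches lowers the ordering cost but enlarges the area under the inventory curve; at the optimum the two cost components are equal (300 each in this example).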
Meeting future demand is just one of our inventory management challenges: we must also allocate inventory appropriately to different activity providers, and occasionally remove old inventory to make room for new activities. We can use algorithms to help us with these processes.
(Note that the situation is more complex than this simple illustration, since we must drill down to look at the availability of different types and styles of activities in each of the categories. But we'll stick to the simple illustrations here for cleanliness.)
How much of what activities to forward position? Which items should go to activity providers across tiered priorities?
We answer these questions by using a model of the system dynamics, fitting it to historical data and using it for robust optimization given quantified uncertainties in our forecasts.
Beyond addressing inventory challenges related to fragmented volume, our focus extends to continually enhancing our selection through the acquisition and development of new activity listings. This strategic approach aims to create a more delightful experience for a diverse clientele and ensures that our styles can cater effectively to a broad range of customers.
Inspired by genetic algorithms, we adopt a method that involves recombination, mutation, and a fitness measure, similar to the natural selection process. Each activity style is treated as a set of attributes or "genes," and we take into account both our extensive style collection and the client feedback ("preference") available for each style.
Using this framework, we generate new activities by combining attributes from existing ones, potentially introducing minor mutations. It's important to note that the number of possible combinations is exceptionally large.
In our subsequent steps, we deviate slightly from the traditional genetic algorithm approach. Rather than solely relying on fitness for selection, we develop a model that assesses how well a specific set of attributes is likely to suit our target clients. This model guides us in highlighting attribute sets that are likely to be well-received.
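The recombination, mutation, and model-based scoring described above can be sketched as follows. The attribute names, their allowed values, and the `score` function are all hypothetical; in particular, `score` is a stand-in for the predictive model, which is not specified here.

```python
import random

def recombine(parent_a, parent_b, rng):
    """Build a child by taking each attribute ("gene") from one parent at random."""
    return {attr: rng.choice([parent_a[attr], parent_b[attr]]) for attr in parent_a}

def mutate(child, attribute_values, rng, rate=0.1):
    """With a small probability, replace an attribute with a random allowed value."""
    mutated = dict(child)
    for attr, options in attribute_values.items():
        if rng.random() < rate:
            mutated[attr] = rng.choice(options)
    return mutated

def propose_candidates(population, attribute_values, score, n_children=20, keep=5, seed=0):
    """Generate children and return the highest-scoring ones for human review."""
    rng = random.Random(seed)
    children = []
    for _ in range(n_children):
        parent_a, parent_b = rng.sample(population, 2)
        children.append(mutate(recombine(parent_a, parent_b, rng), attribute_values, rng))
    return sorted(children, key=score, reverse=True)[:keep]

# Hypothetical attribute space and existing activities.
attribute_values = {
    "pace": ["low", "medium", "high"],
    "setting": ["indoor", "outdoor"],
    "format": ["solo", "group"],
}
population = [
    {"pace": "low", "setting": "indoor", "format": "solo"},
    {"pace": "high", "setting": "outdoor", "format": "group"},
    {"pace": "medium", "setting": "indoor", "format": "group"},
]

def score(candidate):
    """Placeholder for a model predicting how well an attribute set suits target clients."""
    return (candidate["pace"] == "medium") + (candidate["setting"] == "outdoor")

shortlist = propose_candidates(population, attribute_values, score)
```

The shortlist is only a set of suggestions: the sketch stops at ranking, since the final selection is left to people rather than to the score alone.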
Working collaboratively with our human activity providers, we meticulously vet and refine the highlighted attribute sets. This process ensures that the activities align with our brand's vision and the preferences of our clients. The refined activities then evolve into the next generation of activity listings.
There is indeed a lot going on in our Algorithms team. Thus far we have touched on some of the projects in our three vertically-aligned teams: Activity Algorithms, Merch Algorithms, and Client Algorithms.
The Data Platform team plays a crucial role in enabling efficient and effective data analysis, algorithm development, and production deployment for vertically-aligned data scientists. They accomplish this through a combination of data and compute infrastructure, as well as a set of internal Software as a Service (SaaS) products. This integrated environment abstracts away the complexities of the underlying technical aspects, allowing data scientists to primarily focus on the analytical and scientific aspects of their work.
Key aspects of the Data Platform team's role include:
Infrastructure and SaaS Products: The team provides a comprehensive ecosystem that includes both data and compute infrastructure. Additionally, they offer a suite of internal SaaS products designed to facilitate various stages of the data science workflow, from initial analysis to deployment.
Enabling Vertical Alignment: By catering to the needs of vertically-aligned data scientists, the Data Platform team ensures that the tools and resources provided are tailored to specific industry domains or use cases, enhancing the relevance and applicability of the platform.
Analysis and Algorithm Development: The platform offers capabilities that aid data scientists in conducting analysis and developing algorithms. These might involve libraries, frameworks, and tools that streamline data exploration, modeling, and experimentation processes.
Production Readiness: The Data Platform team supports the transition of data science work from development to production. This includes mechanisms for deploying algorithms as services, automating processes, and integrating with other systems in the business environment.
Technical Abstraction: The platform abstracts away technical complexities such as data distribution, parallelization, auto-scaling, and failover. This abstraction empowers data scientists to work at a higher level of abstraction, unburdened by infrastructure details.
Focus on Scalability: The platform's auto-scaling and failover capabilities ensure that data scientists can seamlessly handle varying workloads, making it possible to perform large-scale computations without manual intervention.
Collaboration and Knowledge Sharing: The Data Platform team likely fosters collaboration among data scientists, enabling them to share code, findings, and insights, which can lead to increased productivity and collective learning.
Engineers and Full-Stack Data Scientists: The Data Platform engineers are responsible for designing, developing, and maintaining the core infrastructure and tools. On the other hand, full-stack data scientists possess a blend of data science and application development skills, allowing them to bridge the gap between data-driven solutions and business requirements.
The overarching goal of the Data Platform team is to create an environment where data scientists can seamlessly progress from initial concept to full production deployment, with minimal friction caused by technical complexities. This allows them to leverage their expertise in data science while enjoying the benefits of a scalable and well-managed system provided by the Data Platform team.
The objective of this interactive tour has been to showcase the myriad ways in which data science is harnessed at PursueIt. While attempting to encapsulate our achievements within the nine stories presented above, we acknowledge the challenge in confining our scope. There is a multitude of ongoing projects, with even more concepts in the process of formulation.

It's worth noting that what we recognize as applied data science is deeply rooted in a diverse realm of elegant microeconomic theory. Historically, the practical application of these theoretical concepts has been hindered by factors such as the scarcity of accessible data and computational resources. Moreover, the adoption of these ideas beyond academia has often been constrained by entrenched organizational culture, making change a formidable endeavor.

However, at PursueIt, we find ourselves remarkably unencumbered by these barriers. Our unique business model has enabled us to amass a wealth of valuable data, and our culture fosters an environment in which data scientists can thrive. In this context, our distinguishing feature might simply be our adeptness at acquiring comprehensive data through our distinctive business approach, followed by the cultivation of an environment conducive to the success of data scientists. With these cornerstones in place, curiosity, innovation, and the desire to drive meaningful change become the driving forces propelling our endeavors forward.