Here Comes Applebot: The Start of Apple Web Search?

Posted May 11, 2015

A few days ago, Search Engine Land reported, in Apple Confirms Their Web Crawler: Applebot, that Apple had admitted the web crawling program was being used to help with products such as Siri and Spotlight Suggestions. Apple also pointed Search Engine Land to an Apple Support help page for Applebot.

AppleInsider followed up with the post Apple challenges Google with growing Web Search program, fueled by Topsy Acquisition, reporting that search at Apple was being led by people involved with its Topsy acquisition. This report that Apple’s Topsy team was leading search at Apple inspired me to look at some past history, review the patents that Topsy had acquired before the purchase, and share those here.

Topsy has been around for a few years, and it’s been one of the best places to search through past tweets from Twitter’s firehose of data. I hadn’t looked into their ability to search the Web. Just how much do we know about their ability to provide web search on a larger scale, and is it valid to ask whether that was the reason Apple acquired Topsy?

So, I decided to grab all of the Topsy patents and see if they could provide us with a sense of how Apple might use what is described within them to index content it finds on the Web. To make the most sense of them, it may be easiest to think of them in terms of the purpose they were originally developed for: searching Twitter. Some features that make ranking within the Topsy patents unique are:

1) The content Topsy intends to index is social media in a micro-blogging format, like Tweets or Updates at Google+
2) There’s a focus upon indexing content in a real time manner, which would play to the strengths of formats like Twitter
3) Content topics that get mentioned over and over appear to be valued more highly.
4) The reputation of people creating content is considered in indexing that content
5) That certain things are cited in microcontent seems to indicate a level of importance to them.

Topsy Acquisition

Here are some articles I found that provided more details about the acquisition:

Topsy Pending Patent Applications

Systems and methods for customized filtering and analysis of social media content collected over social networks
Publication number: US20130297581 A1
Publication date: Nov 7, 2013
Filing date: Mar 29, 2013
Inventors: Rishab Aiyer Ghosh, Scott Park Manley
Original Assignee: Topsy Labs, Inc.

Customized-filtering-patent-flow-chart

Abstract:

A new approach is proposed that contemplates systems and methods to filter and/or rank a plurality of content items retrieved from a social network based on the sentiments expressed by the authors of the content items and/or the influence level of the authors over the social network.

  • First, content items matching a set of keywords submitted by a user are retrieved from the social network. The sentiments and/or the influence levels of the authors of these content items are then identified in real time.
  • Once identified, the sentiments and/or influence levels of the authors are used to filter and/or rank the retrieved content items to generate a search result that matches with the sentiment and/or influence level specified by the user.
  • Finally, the customized search result based on the sentiments and/or the influence levels of the authors is presented to the user.
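The three steps above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the `Item` fields, the score ranges, and the `search` function are hypothetical names, and the patent filing does not say how sentiment or influence would actually be computed.

```python
from dataclasses import dataclass

@dataclass
class Item:
    text: str
    author: str
    sentiment: float   # assumed score in [-1, 1], not defined by the patent
    influence: float   # assumed author influence in [0, 1]

def search(items, keywords, min_sentiment=-1.0, min_influence=0.0):
    """Match keywords, filter on sentiment/influence, then rank by influence."""
    kws = [k.lower() for k in keywords]
    # Step 1: retrieve items that match any submitted keyword.
    matched = [i for i in items if any(k in i.text.lower() for k in kws)]
    # Step 2: keep only items meeting the user's sentiment/influence settings.
    kept = [i for i in matched
            if i.sentiment >= min_sentiment and i.influence >= min_influence]
    # Step 3: present the most influential authors first.
    return sorted(kept, key=lambda i: i.influence, reverse=True)

items = [
    Item("Loving the new Apple Watch", "ann", 0.8, 0.9),
    Item("Apple Watch battery is awful", "bob", -0.7, 0.6),
    Item("Apple Watch is fine I guess", "cy", 0.1, 0.2),
    Item("Nice weather today", "dee", 0.5, 0.95),
]
results = search(items, ["apple watch"], min_sentiment=0.0, min_influence=0.1)
print([i.author for i in results])
```

Here the negative post is filtered out by the sentiment setting, and the off-topic post never matches the keywords, however influential its author is.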

Systems and methods for discovery of related terms for social media content collection over social networks

Publication number: US20130304818 A1
Publication date: Nov 14, 2013
Filing date: Mar 29, 2013
Inventors: Jacob Daniel Brumleve, Rishab Aiyer Ghosh, Vipul Ved Prakash
Original Assignee: Topsy Labs, Inc.

Discovery of Related Terms

Abstract:

A new approach is proposed that contemplates systems and methods to discover one or more terms related to one or more query terms submitted by a user for search over a social media network, wherein the related terms discovered are trending and co-occurring with the submitted query terms over the social media network during a specific period of time.

The terms related to the submitted keywords can be discovered based on various measurements that measure the trending characteristics of the terms in the social media content items collected over a period of time.

Once the terms related to the submitted keywords have been discovered, they can be utilized to search or perform aggregated metrics and analytics on the social network together with the user-submitted query terms for content items containing all or most of the query terms and/or the related terms, wherein such content items obtained are presented as the search result to the user or subject to aggregate metrics and analytics presented to the user.
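One way to picture "trending and co-occurring" terms is to compare how often a term appears alongside the query in a recent window against an earlier one. The sketch below does that with a simple ratio. The `related_terms` function, the two-window split, and the ratio itself are my assumptions for illustration, not measurements taken from the filing.

```python
from collections import Counter

def related_terms(posts, query, split_time, top_n=3):
    """posts: list of (timestamp, text). Rank co-occurring terms by trend ratio."""
    q = query.lower()
    before, after = Counter(), Counter()
    for ts, text in posts:
        words = set(text.lower().split())
        if q not in words:
            continue  # only terms that co-occur with the query count
        bucket = after if ts >= split_time else before
        bucket.update(words - {q})
    # Assumed trend score: recent mentions relative to earlier mentions.
    scores = {t: after[t] / (before[t] + 1) for t in after}
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]

posts = [
    (1, "apple earnings report"),
    (2, "apple earnings call"),
    (5, "apple watch launch"),
    (6, "apple watch review"),
    (7, "apple watch earnings"),
]
print(related_terms(posts, "apple", split_time=5))
```

A term like "earnings" that was already common before the split scores low, while "watch" rises to the top because it is new in the recent window.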

Systems and methods for interactive presentation and analysis of social media content collection over social networks

Publication number: US20130297694 A1
Publication date: Nov 7, 2013
Filing date: Mar 29, 2013
Inventors: Rishab Aiyer Ghosh
Original Assignee: Topsy Labs, Inc.

Interactive Presentation

Abstract:

A new approach is proposed that contemplates systems and methods to provide a comprehensive platform that enables a user to interactively measure, display, and analyze various characteristics of the social content items collected over a social media network in real time.

The top content items, such as posts, links, photos, and videos that contain a set of query terms and are currently trending (mentioned frequently) or trending over a period of time are identified via various measurements and presented to a user.

The user is then enabled to selectively and interactively view and analyze the content items presented based on the various metrics that are unique to social media content items, such as the mentions (e.g., retweets) of the content items, the authors of the content items, and the spreading of the content items.

A system and method for search of sources and targets based on relative expertise of the sources

Publication number: WO2011159992 A1
Publication date: Dec 22, 2011
Filing date: Jun 17, 2011
Priority date: Jun 17, 2010
Inventors: Rishab Aiyer Ghosh, Thomas James Emerson, Lun Ted Cui
Applicant: Topsy Labs, Inc.

Relative Expertise of Sources

Abstract:

A new approach is proposed that contemplates systems and methods to provide a ranking of cited objects and citing subjects identified as results of a search, where the relative expertise of subjects or sources of citations to said targets or objects is considered.

The relative expertise is a function of the share of the subject’s citations matching the query term or search criteria relative to the share of all subjects’ citations matching the query term, weighted by the influence of the subjects.
This allows the identification of “experts” on “topics” without any predefined categorization of topics or pre-computation of expertise. Under this novel approach, expertise can be determined on any query term in real-time.
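The description of relative expertise reads almost like a formula, so here is one possible reading of it in Python: a subject's share of citations matching the query, divided by the share across all subjects, multiplied by the subject's influence. The exact function is not given in the abstract, so the arithmetic, the `relative_expertise` name, and the sample data are all assumptions.

```python
def relative_expertise(citations, influence, query):
    """citations: dict of subject -> list of citation texts.

    Returns subject -> score, computed at query time with no
    predefined topics (one assumed reading of the abstract).
    """
    q = query.lower()
    total_all = sum(len(texts) for texts in citations.values())
    match_all = sum(q in t.lower() for texts in citations.values() for t in texts)
    if total_all == 0 or match_all == 0:
        return {}
    global_share = match_all / total_all
    scores = {}
    for subject, texts in citations.items():
        if not texts:
            continue
        share = sum(q in t.lower() for t in texts) / len(texts)
        # Weight the relative share by the subject's influence.
        scores[subject] = influence.get(subject, 0.0) * share / global_share
    return scores

citations = {
    "ann": ["siri tips", "siri review", "cooking"],
    "bob": ["siri news", "sports", "music", "travel", "food"],
}
influence = {"ann": 0.5, "bob": 1.0}
scores = relative_expertise(citations, influence, "siri")
print(scores)
```

In this toy data "ann" comes out ahead on "siri" even though "bob" is more influential overall, because a much larger share of what she cites is about the query.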

Systems and methods for prediction-based crawling of social media network
Publication number: US20130091087 A1
Publication date: Apr 11, 2013
Filing date: Oct 9, 2012
Priority date: Oct 10, 2011
Inventors: Vipul Ved Prakash, Rishab Aiyer Ghosh, Lun Ted Cui
Original Assignee: Topsy Labs, Inc.

Prediction of User-based actions

Abstract:

A new approach is proposed that contemplates systems and methods to support efficient crawling of a social media network based on predicted future activities of each user on the social network.

First, data related to a user’s past activities on a social network are collected and a pattern of the user’s past activities over time on the social network is established.

Based on the established pattern on the user’s past activities, predictions about the user’s future activities on the social network can be established.

Such predictions can then be used to determine the collection schedule – timing and frequency – to collect data on the user’s activities for future crawling of the social network.
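As a rough illustration of how a past activity pattern could drive a collection schedule, the sketch below predicts a user's next post from the average gap between earlier posts and schedules the next crawl for that moment. The averaging rule, the `next_crawl_time` name, and the one-hour default are assumptions on my part. The filing does not commit to a particular prediction model.

```python
def next_crawl_time(post_times, default_interval=3600.0):
    """Predict when to crawl a user next from past post timestamps (seconds)."""
    times = sorted(post_times)
    if len(times) < 2:
        # Not enough history to establish a pattern: fall back to a default.
        last = times[-1] if times else 0.0
        return last + default_interval
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_gap = sum(gaps) / len(gaps)
    # Assumed prediction: the next activity arrives one average gap later.
    return times[-1] + mean_gap

print(next_crawl_time([0, 100, 200, 300]))
print(next_crawl_time([50]))
```

A frequent poster ends up being revisited often, while an account that posts once a week is crawled far less, which is where the efficiency comes from.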

Mediating and pricing transactions based on calculated reputation or influence scores
Publication number: US20100153185 A1
Publication date: Jun 17, 2010
Filing date: Dec 1, 2009
Priority date: Dec 1, 2008
Inventors: Rishab Aiyer Ghosh, Vipul Ved Prakash
Applicant: Topsy Labs, Inc.

User Reputation transactions

Abstract:

Mediating and pricing transactions based on calculated reputation and influence is provided.

In some embodiments, mediating and pricing transactions based on calculated reputation and influence includes determining an influence score (e.g., based on a given dimension) for a subject (e.g., a user), in which the subject is requesting a transaction; and determining approval of the transaction based on criteria including the influence score of the subject.

In some embodiments, the influence score is a directly estimated objective measure of influence (e.g., estimated using a social graph).

In some embodiments, mediating and pricing transactions based on calculated reputation and influence further includes determining pricing of the transaction based on criteria including the influence score of the subject.

In some embodiments, mediating and pricing transactions based on calculated reputation and influence also includes sharing transactional revenue for the transaction with the subject based on criteria including the influence score of the subject.

A system and method for determining quality of cited objects in search results based on the influence of citing subjects

Publication number: WO2011159646 A1
Publication date: Dec 22, 2011
Filing date: Jun 14, 2011
Priority date: Jun 14, 2010
Inventors: Vipul Ved Prakash, Lun Ted Cui, Rishab Aiyer Ghosh, Thomas James Emerson
Applicant: Topsy Labs, Inc.

Citation rankings

Abstract:

A new approach is proposed that contemplates systems and methods to examine and determine quality of objects cited by citations in a search result based on a citation graph that includes citing subjects, citations, and cited objects.

First, influence scores of a plurality of subjects/sources that compose the citations of the objects in the search result are calculated.

The quality of the objects cited by the subjects can then be determined by examining the influence scores for the subjects of the citations.

Finally, the cited objects selected can be presented to a user or provided to a third party for further processing together with the relevant citations and citing subjects.
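A minimal sketch of that idea: treat each citation as a (subject, object) pair and let an object's quality be the sum of the influence scores of the distinct subjects citing it. Summing is my assumption, as are the `rank_cited_objects` name and the sample scores. The abstract only says quality is determined "by examining" those influence scores.

```python
from collections import defaultdict

def rank_cited_objects(citations, influence):
    """citations: list of (subject, object) pairs. Returns objects by quality."""
    quality = defaultdict(float)
    seen = set()
    for subject, obj in citations:
        if (subject, obj) in seen:
            continue  # count each citing subject once per object
        seen.add((subject, obj))
        quality[obj] += influence.get(subject, 0.0)
    return sorted(quality.items(), key=lambda kv: kv[1], reverse=True)

citations = [
    ("ann", "url_a"),
    ("bob", "url_a"),
    ("cy", "url_b"),
    ("cy", "url_b"),  # a repeat citation from the same subject
]
influence = {"ann": 0.2, "bob": 0.3, "cy": 0.9}
ranked = rank_cited_objects(citations, influence)
print(ranked)
```

One citation from a highly influential subject outranks two citations from less influential ones here, which is the main difference from simply counting mentions.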

 

Topsy Granted Patents

System and method for query temporality analysis
Publication number: US8892541 B2
Publication date: Nov 18, 2014
Filing date: Jun 15, 2011
Priority date: Dec 1, 2009
Inventors: Rishab Aiyer Ghosh, Thomas James Emerson, Lun Ted Cui
Original Assignee: Topsy Labs, Inc.

Citation-ranking

Abstract:

A new approach is proposed that contemplates systems and methods to determine temporality of a query in order to generate a search result including a list of objects that are not only based on matching of the objects to the query but also based on temporality analysis of the query.

Here, the temporality of the query can be defined as the distribution over time of the objects matching the query, i.e., the chronology histogram of the query.

Such distribution can be analyzed to provide a classification of the intent of the query.

Classification of the intent of the query can result either in discrete classification of the query into categories, or in continuous classification of the query which may be a scalar or vector value resulting from transformations of the chronology histogram.
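To make the chronology histogram concrete, the sketch below buckets the timestamps of matching objects and looks at how much of the distribution falls in the current bucket. It returns both a discrete label and a continuous value, mirroring the two kinds of classification mentioned. The bucket size, the 0.5 threshold, and the "breaking"/"evergreen" labels are my own hypothetical choices.

```python
from collections import Counter

def chronology_histogram(timestamps, bucket_size):
    """Count matching objects per time bucket."""
    return Counter(int(ts // bucket_size) for ts in timestamps)

def classify_query(timestamps, bucket_size, now, recent_share=0.5):
    """Return (label, share) where share is the fraction in the current bucket."""
    hist = chronology_histogram(timestamps, bucket_size)
    total = sum(hist.values())
    if total == 0:
        return "unknown", 0.0
    current = int(now // bucket_size)
    share = hist[current] / total  # continuous classification (a scalar)
    label = "breaking" if share >= recent_share else "evergreen"  # discrete
    return label, share

print(classify_query([1, 2, 95, 96, 97, 98], bucket_size=10, now=99))
print(classify_query([5, 15, 25, 35, 45, 95], bucket_size=10, now=99))
```

A query whose matches pile up in the latest bucket probably reflects interest in something happening right now, so a search engine could favor fresh results for it.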

Advertising based on influence
Publication number: US8768759 B2
Publication date: Jul 1, 2014
Filing date: Dec 1, 2009
Priority date: Dec 1, 2008
Inventors: Rishab Aiyer Ghosh, Vipul Ved Prakash
Original Assignee: Topsy Labs, Inc.

reputation-advertiser-influence

Abstract:

Advertising based on influence is provided.

In some embodiments, advertising based on influence includes determining an influence score (e.g., based on a given dimension) for a subject (e.g., a user), in which the subject is a potential target for an advertisement; and determining targeting of the advertisement based on criteria including the influence score of potential recipients of the advertisement.

In some embodiments, the influence score is a directly estimated objective measure of influence (e.g., estimated using a social graph).

In some embodiments, advertising based on influence also includes determining pricing of advertisements based on criteria including the influence score of potential recipients of one or more advertisements.

In some embodiments, advertising based on influence further includes sharing advertising revenue with the subject based on criteria including the influence score of the first subject (e.g., as an incentive for the subject to view the advertisement).

Ranking and selecting entities based on calculated reputation or influence scores
Publication number: US8688701 B2
Publication date: Apr 1, 2014
Filing date: Dec 1, 2009
Priority date: Jun 1, 2007
Inventors: Rishab Aiyer Ghosh, Vipul Ved Prakash
Original Assignee: Topsy Labs, Inc.

calculated Reputation or Influence Scores

Abstract:

Ranking and selecting entities based on calculated reputation or influence scores is provided.

In some embodiments, a method includes determining whether a first entity is a subject or an object; determining whether a second entity is a subject or an object; and generating a graph, in which a subset of the graph is a subject graph of subject nodes that includes at least one or more subjects (e.g., subject entities) linked to one or more other subjects, and in which the graph includes one or more objects (e.g., object entities) each linked to one or more subjects in the subject graph.

In some embodiments, the graph includes directed and undirected links.

In some embodiments, the graph includes one or more objects linked to one or more objects.
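The graph described here can be pictured as a small data structure with two kinds of links: subject-to-subject links that form the subject graph, and links tying each object to the subjects connected to it. The class and method names below are hypothetical, and this sketch leaves out undirected links and object-to-object links for brevity.

```python
class SubjectObjectGraph:
    """Toy graph: a subject graph plus objects linked to subjects."""

    def __init__(self):
        self.subject_edges = {}  # subject -> set of subjects it links to
        self.object_edges = {}   # object -> set of subjects linked to it

    def add_subject_link(self, src, dst):
        # Directed link between two subjects; both become subject nodes.
        self.subject_edges.setdefault(src, set()).add(dst)
        self.subject_edges.setdefault(dst, set())

    def add_object_link(self, obj, subject):
        # Each object is linked to one or more subjects in the subject graph.
        self.subject_edges.setdefault(subject, set())
        self.object_edges.setdefault(obj, set()).add(subject)

    def kind(self, entity):
        """Report whether an entity is known as a subject or an object."""
        if entity in self.subject_edges:
            return "subject"
        if entity in self.object_edges:
            return "object"
        return "unknown"

graph = SubjectObjectGraph()
graph.add_subject_link("ann", "bob")
graph.add_object_link("http://example.com/post", "ann")
print(graph.kind("bob"), graph.kind("http://example.com/post"))
```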

Estimating influence of subjects based on a subject graph
Publication number: US8244664 B2
Publication date: Aug 14, 2012
Filing date: Dec 1, 2009
Inventors: Rishab Aiyer Ghosh, Vipul Ved Prakash
Original Assignee: Topsy Labs, Inc.

Abstract:

Estimating influence includes receiving a subject graph, in which the subject graph includes two or more subject nodes, in which each subject node corresponds to a subject; and determining an objective influence measure for each first subject node of the subject graph, in which the determination is based at least in part on a function of inward scores and outward scores, in which inward scores are computed from one or more paths leading to the first subject of a length of at least one, and outward scores are computed from one or more paths leading from the first subject of a length of at least one.
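Here is a toy version of inward and outward scores, assuming an edge (a, b) means subject a points at subject b (for example, by following or citing). It counts walks of length one and two, discounts the longer ones, and combines the two scores as a ratio. The decay factor, the two-hop limit, and the ratio are all assumptions. The abstract only requires that the measure be some function of both scores.

```python
def path_scores(edges, nodes, decay=0.5):
    """Return (inward, outward) dicts counting walks of length 1 and 2."""
    out_nb = {n: set() for n in nodes}
    in_nb = {n: set() for n in nodes}
    for src, dst in edges:
        out_nb[src].add(dst)
        in_nb[dst].add(src)
    inward, outward = {}, {}
    for n in nodes:
        # Length-1 walks count fully; length-2 walks are discounted by decay.
        inward[n] = len(in_nb[n]) + decay * sum(len(in_nb[m]) for m in in_nb[n])
        outward[n] = len(out_nb[n]) + decay * sum(len(out_nb[m]) for m in out_nb[n])
    return inward, outward

def influence(edges, nodes):
    """Assumed combination: inward attention relative to outward attention."""
    inward, outward = path_scores(edges, nodes)
    return {n: inward[n] / (1.0 + outward[n]) for n in nodes}

nodes = ["a", "b", "c", "d"]
edges = [("a", "c"), ("b", "c"), ("c", "d")]
scores = influence(edges, nodes)
print(scores)
```

In this example "d" ends up with the highest score: only "c" points to it directly, but "c" itself is pointed at by two others, and "d" points at nobody.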

Intelligent reputation attribution platform
Publication number: US7991725 B2
Publication date: Aug 2, 2011
Filing date: Sep 30, 2010
Inventors: Rishab Aiyer Ghosh, Vipul Ved Prakash
Original Assignee: Topsy Labs, Inc.

Abstract:

Systems and methods allowing for the attribution of reputation to data sources (e.g., for the creation of referrals) are provided.

In an illustrative implementation, scores (e.g., reputation scores) are determined for a target entity connected to a source entity on a network on a given dimension.
In the illustrative implementation, an entity may be directly linked to any number of other entities on any number of dimensions, with each link having an associated score.

Illustratively, each dimension has an associated transitive dimension. A directed path on a given dimension between two entities, a source and a target, consists of a directed link from the source entity to an intermediate entity, prefixed to a directed path from the intermediate entity to the target entity.

In the illustrative implementation, links on the path can travel on the transitive dimension associated with the given dimension.
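The recursive definition of a directed path (a link to an intermediate entity, prefixed to a path from that intermediate to the target) maps naturally onto a recursive function. The sketch below scores a target as the best product of link scores along any path from the source. Multiplying scores, taking the best path, and the hop limit are my assumptions, and the sketch uses a single dimension rather than modeling the separate transitive dimension.

```python
def path_score(links, source, target, max_hops=3, _seen=None):
    """links: dict of source -> {target: score in [0, 1]} on one dimension."""
    if source == target:
        return 1.0
    if max_hops == 0:
        return 0.0
    seen = (_seen or set()) | {source}
    best = 0.0
    for intermediate, score in links.get(source, {}).items():
        if intermediate in seen:
            continue  # avoid cycles
        # A path is a link to an intermediate plus a path from it to the target.
        rest = path_score(links, intermediate, target, max_hops - 1, seen)
        best = max(best, score * rest)
    return best

links = {
    "ann": {"bob": 0.9, "cy": 0.5},
    "bob": {"dee": 0.8},
    "cy": {"dee": 1.0},
}
print(path_score(links, "ann", "dee"))
```

Even though "ann" has no direct link to "dee", a reputation score can still be attributed through "bob", which is the referral-style behavior the abstract hints at.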

Intelligent reputation attribution platform (Longer Claims Section)
Publication number: US7831536 B2
Publication date: Nov 9, 2010
Filing date: Jun 1, 2007
Inventors: Rishab Aiyer Ghosh, Vipul Ved Prakash
Original Assignee: Topsy Labs, Inc.

Abstract:

Systems and methods allowing for the attribution of reputation to data sources (e.g., for the creation of referrals) are provided.

In an illustrative implementation, scores (e.g., reputation scores) are determined for a target entity connected to a source entity on a network on a given dimension.
In the illustrative implementation, an entity may be directly linked to any number of other entities on any number of dimensions, with each link having an associated score.

Illustratively, each dimension has an associated transitive dimension. A directed path on a given dimension between two entities, a source and a target, consists of a directed link from the source entity to an intermediate entity, prefixed to a directed path from the intermediate entity to the target entity.

In the illustrative implementation, links on the path can travel on the transitive dimension associated with the given dimension.

Other Sources

Another company Apple acquired was Chomp Inc. According to ZDNet, in Apple acquires Chomp, co-founded by woman in tech Cathy Edwards, the acquisition was focused upon technology developed by the company. The focus of their work appears to have been to make apps a lot easier to find. The algorithm they developed was not discussed in much detail in the press and was published after they had joined Apple, and Chomp Inc. is noted prominently in the provisional patent that is linked to below.

Generation of topic-based language models for an app search engine
Publication number: US20120191694 A1
Publication date: Jul 26, 2012
Filing date: Apr 5, 2012
Inventors: Natalia Hernandez Gardiol, Catherine Anne Edwards
Original Assignee: Apple Inc.

These screenshots provide a condensed look at the process behind the patent:

Indexing Apps

Indexing apps

Abstract:

Topic-based language models for an application search engine enable a user to search for an application based on the application’s function rather than title.
To enable a search based on function, information is gathered and processed, including application names, descriptions and external information.

Processing the information includes filtering the information, generating a topic model and supplementing the topic model with additional information.

The provisional patent that preceded this one, and is incorporated into it, is Improved Generation of Topic-Based Language Models for an App Search Engine, Serial No. 61/473,672. It’s very straightforward in describing that the best way to search through apps is to search for the functions that they perform, rather than what their names might be.
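To show the difference between searching by function and searching by title, here is a deliberately simple sketch that matches query words against app descriptions and ignores names entirely. It uses plain word overlap, not a topic-based language model, and the app names, descriptions, and `search_apps` function are invented for illustration.

```python
def search_apps(apps, query):
    """apps: dict of name -> description. Rank by query-word overlap."""
    query_words = set(query.lower().split())
    scored = []
    for name, description in apps.items():
        # Only the description (what the app does) is matched, never the name.
        overlap = len(query_words & set(description.lower().split()))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties broken alphabetically by name.
    return [name for _, name in sorted(scored, key=lambda s: (-s[0], s[1]))]

apps = {
    "ReadLater": "save articles to read later offline",
    "SongFinder": "identify songs playing around you",
    "LinkSaver": "save links and read them later",
}
print(search_apps(apps, "read later"))
print(search_apps(apps, "identify songs"))
```

A query like "identify songs" finds "SongFinder" without the searcher needing to know what the app is called. A real topic model would go further and match related wording that shares no exact terms with the query.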
