Filters

Introduction

ContentGems indexes millions of articles per week. Filters allow you to shape the algorithm so that ContentGems delivers the content river for the topics you’re interested in. This article provides a thorough walkthrough of the Filters tab.

Setting Up A Filter for the First Time

When you first start your account, ContentGems asks for a search term to get your first set of results.

Start with a single term or phrase. You can add more keywords and refine the content later. Then, hit the “Create Filter” button.

Navigating 

Once created, your first Filter appears in the left column under the “What’s New” Folder.

Create a new Filter

Select the "+ New Filter" link from the Dashboard's Filter column, or from the Filters workspace sidebar.

Navigating Folders

Let’s quickly discuss Folders. Folders are listed in the left column. The “What’s New” folder is the default, and it is where your first Filter will appear. Under each Folder, the “Settings” option lets you rename or delete the Folder and change its position as you create new Folders.

Create a new Folder

Select the "New Folder" option from the left column.

Rename a Folder

Select the Settings option underneath your Folder. You can now change the name of your Folder.

Delete a Folder

Select the Settings option underneath your Folder. You can now delete your Folder.

Change the Order of Folders

Select the Settings option underneath your Folder. You can now move a Folder up or down.

Navigating a Filter

Let’s go over the options for any given Filter. 

Along the top is the Filter’s name, a button for creating an RSS feed out of the Filter’s results, options for viewing the results, and the “Edit” button to open up additional settings.

Get Your Filter as RSS Feed 

If you have a pro account, you can obtain an RSS feed for your results to plug into a wide range of platforms, such as Buffer, Slack, MailChimp, CampaignMonitor, your favourite RSS reader, and many more.

First, click the yellow RSS button. Next, click the clipboard to copy the RSS feed address.
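Once you have copied the feed address, any tool that reads RSS can consume it. As a rough illustration, the sketch below parses RSS 2.0 XML with Python's standard library. The sample feed, its titles, and its links are made up for the example, and the function name `parse_items` is hypothetical; the fields in your actual ContentGems feed may differ.

```python
import xml.etree.ElementTree as ET

# Made-up sample in RSS 2.0 format; a real feed would be downloaded
# from the address copied via the clipboard icon.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Filter</title>
    <item>
      <title>Third wave coffee explained</title>
      <link>https://example.com/third-wave</link>
    </item>
    <item>
      <title>Designing a coffee shop</title>
      <link>https://example.com/design</link>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return a list of (title, link) tuples, one per <item> in the feed."""
    root = ET.fromstring(rss_text)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]


for title, link in parse_items(SAMPLE_RSS):
    print(title, "->", link)
```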

You can view the article on the original domain by clicking the image or headline. When you click on the copy, you’ll see additional metadata, images, and related tweets, including word count and where your search terms are in the article (the same as the info button, explained below). From there you can choose to see the full article.

Changing the Filter View

Changing the Filter view will affect how you see the articles.

In Grid View, you see a preview image, the web domain, when ContentGems found the article, an excerpt of the copy, a select box on the lower-left corner to select the article, and in the lower-right corner options to block a website, see more information about the article, share it directly, and view the number of times it has been tweeted in the ContentGems community.

In List View, you get the same options as in Grid View.

Lastly, in the Compact view, you can skim more articles at once, but you receive less information: there is only the headline, a shorter excerpt, the ability to select articles, and the number of tweets.

No matter the view you choose, you can select some or all of the articles to then share to your Destinations or to block the websites they come from so that you will no longer see articles from those web domains.

Let’s go more in-depth into the options with each article. Underneath are icons with additional options that help you shape your content river.

  • Select box: Use this to select multiple articles, which you can then share or block the websites they’re from
  • Info: Show additional metadata, images, and related tweets, including word count and where your search terms are in the article
  • Share: Share to your Destinations or directly to social media including HootSuite, Buffer, Instapaper, Pocket, or email
  • Block: Block any and all articles from this site 

Filter Settings

Clicking the “Edit” button in the upper right opens up the toolbar for customizing and modifying your Filter. You can change the Keywords, Sources, and Settings.

You can use the pencil icon to update the name of your Filter, delete it, or move it to another folder.

Rename a Filter

Select the pencil icon next to the Filter name. Enter the new name and select "Update".

Delete a Filter

Select the pencil icon next to the Filter name. Select the "Delete" button.

Move a Filter to a different Folder

Select the pencil icon next to the Filter name. Select the new Folder from the dropdown menu and then click the "Update" button.

The Keyword Suggester, in the yellow box, provides additional keywords that could be relevant to your search. 

For example, for “coffee shop” it will also suggest “coffeehouse” and “cafe” as potential keywords to add. You can then select from the suggestions and include them in your keywords section, and then watch like magic as more articles appear.

The keywords tab is where you can adjust the keywords and the Boolean logic for your Filter. You can have a maximum of 99 phrases total.

Boolean

Here’s a brief refresher on how Boolean works:

  • OR means that it will contain at least one of the terms you entered
  • AND means that it must contain all the terms you entered

You can then chain these commands to create deeper instruction.

With AND, not only can you say, “gimme only results that satisfy these conditions,” but you can also say what you don’t want to see. That’s where Must and Must Not come into play.

  • Must Not means that it will not include that term you entered

For our example, if you see this:

A or B

AND

C or D

AND

Must Not E

It means that the article must include either A or B, as well as either C or D, and it must not include a reference to E.
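One way to read the example is as a single expression: (A or B) AND (C or D) AND NOT E. The sketch below evaluates that logic in Python. It assumes plain case-insensitive substring matching, and `matches_filter` is a hypothetical helper for illustration, not how ContentGems matches articles internally.

```python
def matches_filter(text, must_groups, must_not_terms):
    """Return True if text satisfies every OR group and has no excluded term.

    must_groups: list of lists. Terms inside a group are ORed,
                 and the groups themselves are ANDed.
    must_not_terms: terms that must not appear in the text.
    """
    lowered = text.lower()
    # AND across groups, OR within each group
    for group in must_groups:
        if not any(term.lower() in lowered for term in group):
            return False
    # Must Not: reject the article if any excluded term appears
    return not any(term.lower() in lowered for term in must_not_terms)


groups = [["coffee shop", "cafe"], ["design", "interior"]]
excluded = ["revenue"]

print(matches_filter("A cafe with a striking interior", groups, excluded))
print(matches_filter("Cafe design trends and revenue growth", groups, excluded))
```

The first article passes because it hits one term in each group and avoids the excluded term. The second hits both groups but is rejected because it mentions "revenue".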

[ Not familiar with Boolean? Here’s our guide to getting the most out of Boolean]

Why would you want to use Must Not? It can help you pinpoint your topic and reduce ambiguity to produce stronger results. With coffee shops, you could be more interested in the design instead of the business side, so you would exclude articles including “revenue” with the Must Not option.

Rule kinds

There are a number of kinds of rules to help you filter articles based on their content.

  • "contain the exact phrase"
    • Use this option to specify exact terms or phrases found in an article 
    • Example: The exact term "water" matches "Water down the bridge" but doesn't match "Watermelon sugar"
    • Matching is not case-sensitive.
  • "contain words starting with"
    • Use this option to specify word prefixes found in an article
    • Example: The word prefix "water" matches both "Water down the bridge" and "Watermelon sugar"
  • "contain text similar to phrase"
    • Use this option to specify fuzzy search terms
    • Example: The fuzzy search term "color" matches both "colour" and "color"
  • "be shared with hashtag"
    • Use this option to specify the hashtag an article was shared under on Twitter. Hashtags have to match exactly, although they are not case sensitive
  • "be from Web Domain ending with"
    • Use this to specify the domain suffix under which an article is hosted. This rule is useful to specify from which kind of websites you want to get recommendations. You can, e.g., limit a search to Canadian websites by entering ".ca" in this rule with a Must application. Or you can exclude articles from a specific website by entering the Web Domain in this rule with a Must Not application
  • "match advanced query"
    • Use this to specify advanced rules for matching content. The next section on advanced query syntax will provide all the details
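The difference between the first two rule kinds can be approximated with regular expressions. This is only a sketch: the function names are hypothetical, and word boundaries (`\b`) stand in for whatever tokenization ContentGems actually uses.

```python
import re


def contains_exact_phrase(text, phrase):
    """Case-insensitive match of the phrase as whole words."""
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def contains_word_starting_with(text, prefix):
    """Case-insensitive match of any word that begins with the prefix."""
    pattern = r"\b" + re.escape(prefix)
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


for sample in ("Water down the bridge", "Watermelon sugar"):
    print(sample,
          contains_exact_phrase(sample, "water"),
          contains_word_starting_with(sample, "water"))
```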

Advanced Query Syntax

Here are some expert tips for pinpointing your search.

Wildcards

Your keywords may include a bunch of variations. For example, you may want to learn more about paint, so in addition to “paint,” you could also search for “paints,” “painter,” “painters,” “painting,” and “paintings,” which would net you more articles. You can use a wildcard search to cut down on the duplication, just by using the  * symbol. For instance, the query paint* would look for all of the above. 
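To get a feel for what a trailing wildcard covers, the sketch below applies the pattern to each word of a text using Python's `fnmatch` module. The helper `wildcard_matches` and the simple word splitting are assumptions made for this illustration.

```python
import fnmatch
import re


def wildcard_matches(text, pattern):
    """Return the words in text that match a shell-style wildcard pattern."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if fnmatch.fnmatchcase(w, pattern.lower())]


print(wildcard_matches("The painter sold two paintings and some paint.", "paint*"))
```

Keep in mind that a wildcard also matches words you may not have intended, such as "paintball".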

Groupings 

Parentheses allow you to create queries with nested logic. For instance, to search for content that must contain either “information” or “technology”, you would include the following term: (information technology).

Field specifiers

Field specifiers allow you to query a particular field in an article. If you don't specify a field, the term will be matched against the article's title and body text fields.

The following fields are available for searching:

  • body searches in the article body only. Example: To find articles that have the term "apple" in their body text, enter body:apple as one of your query terms.
  • domain matches the domain suffix in the article's URL. Use this to find articles from a given Web Domain, e.g., for geographic filtering. Domains are interpreted from right to left. This may be unexpected. So to match any ".uk" domains, you just enter domain:uk.
    • Example 1: To match articles from Web Domain ending in ".com.au", enter domain:com.au
    • Example 2: To match articles from a specific Web Domain, enter domain:contentgems.com.
  • excerpt searches the first 300 characters in the article's body text only. Sometimes searching this field instead of the entire body will eliminate noisy results since the most important terms are typically found at the beginning of an article. Example: To search for articles that contain the term "content marketing" at the beginning of the body text, enter excerpt:"content marketing".
  • hashtag finds articles that were shared on Twitter with this hashtag. Example: To find articles that were shared on Twitter with the "#beyonce" hashtag, enter the following: hashtag:beyonce.
  • title searches in the article title only. Example: To find articles that contain the term "green tea" in their title, enter the search term title:"green tea".
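The right-to-left reading of domains can be pictured as comparing dot-separated labels, starting from the end of the host name. The following sketch illustrates that idea. `domain_matches` is a hypothetical helper, and the label-by-label comparison is an assumption about the behaviour described above.

```python
from urllib.parse import urlparse


def domain_matches(url, suffix):
    """Return True if the URL's host ends with the given domain labels."""
    host = (urlparse(url).hostname or "").lower()
    host_labels = host.split(".")
    suffix_labels = suffix.lower().strip(".").split(".")
    # Compare whole labels from the right, so "contentgems.com"
    # does not match "notcontentgems.com".
    return host_labels[-len(suffix_labels):] == suffix_labels


print(domain_matches("https://www.bbc.co.uk/news", "uk"))
print(domain_matches("https://example.com.au/story", "com.au"))
print(domain_matches("https://notcontentgems.com/", "contentgems.com"))
```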

Boosting

The ContentGems ranking system delivers relevant content into your river, and sometimes you may want to prioritize some important keywords over others. That’s where boosting comes into play: basically, boosting allows you to control the importance of a term in a search. 

To boost a term use the  ^ symbol with a boost factor (by the exponential of that number) at the end of the term. For instance, if you have a search that includes the keyword "AdWords" and want to boost this keyword then use the query AdWords^2. To boost a phrase, append the boost modifier after the closing quote: "content marketing"^10.

Any terms that don't have a field or boosting specified are searched in the title and body text fields by default, and the title gets a boost of ^25. You could reproduce the default behaviour with the following term: (title:water^25 body:water). This is a Boolean Or query that searches for the term "water" in the article's title field with a boost factor of 25, and in the body field with no boost. This approach ranks articles with the term in the title higher than those that contain the term only in the body.
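To see why a title boost pushes some articles up the ranking, consider the toy scoring function below. It is a simplified assumption, not the ContentGems ranking formula: it multiplies the number of whole-word hits in each field by that field's boost. The function name `boosted_score` and the sample articles are invented for the illustration.

```python
def boosted_score(article, term, title_boost=25.0, body_boost=1.0):
    """Toy relevance score: boosted count of whole-word hits per field."""
    needle = term.lower()

    def hits(text):
        # Count words equal to the term, ignoring case and
        # punctuation attached to the word.
        return sum(1 for w in text.lower().split()
                   if w.strip(".,;:!?\"'") == needle)

    return title_boost * hits(article["title"]) + body_boost * hits(article["body"])


in_title = {"title": "Water shortages ahead", "body": "Cities plan for drought."}
in_body = {"title": "City planning news", "body": "Water use and water pricing."}

print(boosted_score(in_title, "water"))
print(boosted_score(in_body, "water"))
```

With these assumptions, a single hit in the title outweighs two hits in the body.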

Lastly, you can’t “negative boost” a keyword. So if you’ve included a Must Not, there’s no need to boost it down further. ContentGems will already exclude that keyword.

Fuzzy matching

To match similar spellings, you can make a term fuzzy by appending a tilde and a fuzzy factor. E.g., color~0.3 will match both "color" and "colour". Here the fuzzy factor is 0.3: the higher the fuzzy factor, the fuzzier the matches are.
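The idea of a fuzzy factor can be sketched with Python's `difflib`. In this illustration the factor is treated as the amount of dissimilarity that is tolerated, which is an assumption chosen to match the "higher is fuzzier" description. `SequenceMatcher` is a stand-in, not the algorithm ContentGems uses, and `fuzzy_match` is a hypothetical helper.

```python
from difflib import SequenceMatcher


def fuzzy_match(term, word, fuzzy_factor):
    """Return True if word is within the tolerated dissimilarity of term."""
    similarity = SequenceMatcher(None, term.lower(), word.lower()).ratio()
    return (1.0 - similarity) <= fuzzy_factor


for candidate in ("color", "colour", "table"):
    print(candidate, fuzzy_match("color", candidate, 0.3))
```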

More on Keywords

If you expect that all your articles will contain a specific phrase, that’s a good place to start to filter out noise: examples include unique location names, personal names, company names, product names, TV show titles, etc.—basically any phrase unique to the topic of interest.

ContentGems looks for exact phrases, so think carefully about how often that exact phrase will be used. For example, if you want to learn about organic coffee beans, you will receive more articles by searching for “organic” AND “coffee beans” than for “organic coffee beans.” There’s no guarantee that people writing about that subject will use that exact three-word phrase. If your phrase contains a hyphen, there’s no need to include it. Searching for “third wave coffee” works better than “third-wave coffee.”

Field Specifier

Some rule kinds let you choose which of an article's fields you want to apply the rule to. The default setting is to search both the article's title and body text. That works well in most situations. However, there may be cases where you want to narrow down which fields are queried.

An example of narrowing down the fields is to make sure that a given term or phrase is important in the article. Important words tend to appear in the title or near the beginning of the article. In that case, you can choose "in the title or first paragraph" or "in the title".

Sources

The next tab is for the sources that feed into your content river. Here you specify which Feed Collections you want to use for the Filter, and which web domains you want to exclude from the results.

Feed Collections

By default, your Filter will draw from the entire ContentGems fire hose. However, sometimes you may be most interested in a handful of sites, or from the Feed Collections you’ve curated.

You can limit the Feeds considered for a given Filter using the "Only include articles found in these Feed Collections" setting. Once you have organized your trusted Feeds into Feed Collections, you can include them here so that only articles from your included Feed Collections are being searched.

This is very useful for broad topics or topics that use ambiguous terms. If you are having a hard time getting good results using keywords, then you can improve things by limiting the search to Feeds that are relevant to your topic of interest.

Sometimes you may want to exclude a Feed you consider to be of low quality. In that case, you use the "Exclude these Feed Collections" setting. If you add any Feed Collections here, then Articles contained in them will be excluded from the search results.

Blocked Websites

You can block articles from certain web domains if you consider them unsuitable for your recommendations, for example because they are of low quality or off-topic. To block a web domain, just click on the "Block" icon on an article from that domain. Once blocked, the Filter will never include articles from that web domain again.

Accidentally block a website or change your mind? No problem: you can remove blocked web domains under Filter settings > Sources, by clicking on the "X" icon next to the domain.

Settings

The third tab is Settings, which contains additional options that let you further filter down articles.

The following settings are available:

  • Minimum popularity
    • Set it to `None` to find articles that aren't popular in social media (yet). The `None` setting will likely include noise, and you should manually curate the recommended articles.
    • Set it to a higher value to only consider articles that have been vetted in social media already.  A higher setting is well suited for automated sharing without manual curation.
  • Media: must have image: Check this checkbox to include only articles that have a primary image. In some sharing scenarios, it is advantageous to add visual interest with images.
  • Minimum word count (body): If you are looking for longer articles, then set this parameter to a higher value.
  • Minimum word count (title): The number of words in the title can be used as a quality metric. Longer titles may indicate higher quality.
  • Content must be…: Limit articles to any of the pre-configured content categories.
  • Content must not be…: Exclude articles from any of the pre-configured content categories.
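Taken together, several of these settings behave like a checklist that each article has to pass. The sketch below models three of them. The `Article` class, its field names, and `passes_settings` are hypothetical and only mirror the settings described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Article:
    title: str
    body: str
    image_url: Optional[str] = None  # None means no primary image


def passes_settings(article, min_body_words=0, min_title_words=0,
                    must_have_image=False):
    """Return True if the article satisfies all configured minimums."""
    if must_have_image and not article.image_url:
        return False
    if len(article.body.split()) < min_body_words:
        return False
    if len(article.title.split()) < min_title_words:
        return False
    return True


short_post = Article(title="Coffee", body="A very short note.")
long_read = Article(title="How third wave coffee changed cafes",
                    body="word " * 500,
                    image_url="https://example.com/cover.jpg")

print(passes_settings(short_post, min_body_words=300, must_have_image=True))
print(passes_settings(long_read, min_body_words=300, must_have_image=True))
```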

Other Settings

  • Rank results by:
    • This setting determines how articles are ranked before they are trimmed and sorted. Please note that this is different from sorting! Ranking determines which articles will make it into the final list (only the top-ranked articles). Once we have the set of trimmed articles, they will be sorted chronologically.
  • Remove duplicates that are
    • Use this setting to determine how aggressively duplicate articles are removed. ContentGems looks at the article's title and entire body when deciding if two articles are duplicates. It computes the Jaccard index for every pair of articles, using the article's bag of words as the set elements. The similarity settings range from a Jaccard index of 70% (slightly similar) to 95% (identical).
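The Jaccard index is the size of the intersection of two sets divided by the size of their union. The sketch below computes it over the distinct words of two texts. The tokenization and the helper names are assumptions for illustration, and ContentGems' own preprocessing may differ.

```python
import re


def word_set(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_index(text_a, text_b):
    """Return |A ∩ B| / |A ∪ B| for the word sets of the two texts."""
    set_a, set_b = word_set(text_a), word_set(text_b)
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def is_duplicate(text_a, text_b, threshold=0.95):
    """Flag a pair as duplicates when similarity reaches the threshold."""
    return jaccard_index(text_a, text_b) >= threshold


a = "coffee shops open downtown"
b = "coffee shops open uptown"
print(jaccard_index(a, b))          # 3 shared words out of 5 distinct words
print(is_duplicate(a, b, 0.70))
```

A lower threshold such as 0.70 removes more near-duplicates, while 0.95 only removes articles that are almost word-for-word the same.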

Usage

The fourth tab is Usage, which shows which Workflows make use of this Filter.
