2.2.11.1 Common Definitions

A white space character is a character that delimits other syntax literals.

 ws           =  %x09 / %x20 / %x0A / %x0D; tab, space, line feed, carriage return

An implementation of the SharePoint Search Keyword syntax MUST support the following operators:

 words-operator = %x57.4F.52.44.53 ;case-sensitive "WORDS"
 near-operator = %x4E.45.41.52 ;case-sensitive "NEAR"
 not-operator = %x4E.4F.54 ;case-sensitive "NOT"
 and-operator = %x41.4E.44 ;case-sensitive "AND"
 or-operator = %x4F.52 ;case-sensitive "OR"
 all-operator = %x41.4C.4C ;case-sensitive "ALL"
 any-operator = %x41.4E.59 ;case-sensitive "ANY"
 none-operator = %x4E.4F.4E.45 ;case-sensitive "NONE"
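
Because every operator is defined by explicit byte values, matching is case-sensitive. The following is a minimal sketch of that check, not a prescribed implementation; the names OPERATORS and is_operator are hypothetical and do not appear in this specification.

```python
# Illustrative only: the keyword list is taken from the grammar above,
# but the set name and the helper function are assumptions.
OPERATORS = frozenset({"WORDS", "NEAR", "NOT", "AND", "OR", "ALL", "ANY", "NONE"})


def is_operator(token: str) -> bool:
    """Return True only for an exact, case-sensitive operator keyword."""
    return token in OPERATORS
```

Under this sketch, "AND" is recognized as an operator, while "and" and "And" are not operators and are handled like any other query text.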

Supported literals are listed as follows.

The non-terminal symbol property-quoted-string introduces string literals enclosed in double quotes. They can contain any Unicode character except the double quote. String literals MUST NOT have a length limit of their own; their length is bounded only by the maximum length of the overall SharePoint Search Keyword query.

 property-quoted-string = %x22 1*property-dquote-char %x22 ; %x22 is double quote
 property-dquote-char   =   %x00-21 / %x23-10FFFF
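
As an illustration, the property-quoted-string rule can be approximated with a regular expression. This is a sketch under the assumption that the query text is already available as a Unicode string; the name PROPERTY_QUOTED_STRING is hypothetical.

```python
import re

# Opening quote, one or more characters that are not a double quote,
# closing quote. The capture group holds the literal's content.
PROPERTY_QUOTED_STRING = re.compile(r'"([^"]+)"')

match = PROPERTY_QUOTED_STRING.fullmatch('"John Smith"')
content = match.group(1) if match else None
```

The rule requires at least one character between the quotes, so an empty pair of quotes does not match in this sketch.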

The non-terminal symbol property-token introduces string literals that are not enclosed in double quotes. They can contain any Unicode character from the following Unicode General Categories: Lu, Ll, Lt, Lm, Lo, Nd, and Pc (these categories are specified in [UNICODE] section 4.5, General Category - Normative). The General Category of each Unicode code point is specified in [UNICODE3.1].

 property-token = 1*property-char
 property-char = %x30-39 / %x41-5A / %x5F / %x61-7A / %xAA / %xB5 / %xBA / %xC0-D6 / %xD8-F6 / %xF8-FF ;defined as example for the codes lower than %x100
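
The category-based definition of property-char can be sketched with Python's standard unicodedata module. Note the assumptions: unicodedata reflects the Unicode version bundled with the Python runtime rather than [UNICODE3.1], so a few code points might be classified differently, and the helper names below are hypothetical.

```python
import unicodedata

# General Categories permitted in a property-token, per the text above.
ALLOWED_CATEGORIES = frozenset({"Lu", "Ll", "Lt", "Lm", "Lo", "Nd", "Pc"})


def is_property_char(ch: str) -> bool:
    """Check a single character against the permitted General Categories."""
    return unicodedata.category(ch) in ALLOWED_CATEGORIES


def is_property_token(text: str) -> bool:
    """A property-token is one or more property-char characters."""
    return len(text) > 0 and all(is_property_char(ch) for ch in text)
```

For example, "Author_1" satisfies this check, while "first name" does not, because the space character belongs to category Zs.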

A language literal identifies a language, whether spoken, written, signed, or otherwise signaled. An implementation MUST support language tags as specified in [RFC4646] section 2.1.

 property-language = langtag ; langtag is specified in [RFC4646] section 2.1

Boolean literals represent logical values and MUST be either "true" or "false":

 property-boolean = "TRUE" / "FALSE"

Numeric literals represent numbers, including positive and negative integers and floating point values.

The non-terminal symbol property-decimal introduces integer literals. A protocol server can recognize a sequence of characters as property-decimal if it matches the decimal non-terminal as specified in section 2.2.13.1. Alternatively, a protocol server can take into account the language in which the search query was formulated and recognize the string representation of an integer specific to that language.

The non-terminal symbol property-float introduces floating point values. A protocol server can recognize a sequence of characters as property-float if it matches the float non-terminal as specified in section 2.2.13.1. Alternatively, a protocol server can take into account the language in which the search query was formulated and recognize the string representation of a floating point value specific to that language.

Date literals represent specific dates. The protocol server can allow a time part to be present with the date. If a time part is present, the protocol server MUST ignore it.

The non-terminal symbol property-date-no-ws introduces a date literal that MUST NOT contain any white space characters (specified by the ws non-terminal symbol in this section). A protocol server can recognize a sequence of characters as property-date-no-ws if it matches the date-date non-terminal as specified in section 2.2.13.1. Alternatively, a protocol server can take into account the language in which the search query was formulated and recognize the string representation of dates specific to that language.

The non-terminal symbol property-date introduces a date literal that can contain white space characters (specified by the ws non-terminal symbol in this section). A protocol server can recognize a sequence of characters as property-date if it matches the date-value non-terminal as specified in section 2.2.13.1. Alternatively, a protocol server can take into account the language in which the search query was formulated and recognize the string representation of dates specific to that language.

An implementation MUST support names that represent date intervals relative to the current date, as listed in the following table:

Name of date interval    Description
Yesterday                The latest day that precedes the current date.
This week                The entire week that includes the current date.
This month               The entire month that includes the current date.
Last month               The latest entire month that precedes the current month.
This year                The entire year that includes the current date.
Last year                The latest entire year that precedes the current year.

  
 property-date-named = "yesterday" / %x22 "yesterday" %x22 / %x22 "this week" %x22 / %x22 "this month" %x22 / %x22 "last month" %x22 / %x22 "this year" %x22 / %x22 "last year" %x22
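
The named intervals can be resolved to concrete date ranges relative to the current date. The sketch below is one possible interpretation, not a normative algorithm. It assumes that a week runs from Monday through Sunday (this specification does not state the first day of the week), and the function name named_interval is hypothetical. For brevity it strips quotes from any name, although the grammar permits only "yesterday" to appear unquoted.

```python
from datetime import date, timedelta
from typing import Tuple


def named_interval(name: str, today: date) -> Tuple[date, date]:
    """Return the inclusive (start, end) dates for a named date interval."""
    key = name.strip('"').lower()
    if key == "yesterday":
        day = today - timedelta(days=1)
        return day, day
    if key == "this week":
        # Assumption: weeks start on Monday (date.weekday() returns 0 for Monday).
        start = today - timedelta(days=today.weekday())
        return start, start + timedelta(days=6)
    if key == "this month":
        start = today.replace(day=1)
        # Jump safely into the next month, then step back to this month's last day.
        next_month = (start + timedelta(days=32)).replace(day=1)
        return start, next_month - timedelta(days=1)
    if key == "last month":
        end = today.replace(day=1) - timedelta(days=1)
        return end.replace(day=1), end
    if key == "this year":
        return date(today.year, 1, 1), date(today.year, 12, 31)
    if key == "last year":
        return date(today.year - 1, 1, 1), date(today.year - 1, 12, 31)
    raise ValueError(f"unknown date interval: {name!r}")
```

With the current date set to 15 March 2024, "last month" resolves under these assumptions to the range 1 February 2024 through 29 February 2024.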

A part of the query text enclosed in double quotes is a text-phrase. It MUST NOT itself contain any double quotes. A crawled item MUST match a phrase if it contains all tokens that appear between the quotes, in the same order in which they appear in the phrase. As a special case, if the last character in the phrase is an asterisk character, all tokens in the phrase MUST be interpreted as prefixes. In other words, a crawled item MUST match the phrase if it contains tokens that begin with the respective tokens between the quotes, and these tokens appear in the same order.

 text-phrase = %x22 0*text-dquote-char %x22 ; %x22 is double quote
 text-dquote-char   =   %x00-21 / %x23-10FFFF ; %x22 is double quote
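
The phrase matching rules, including the trailing-asterisk special case, can be sketched as follows. Several points here are assumptions that this specification leaves open: tokens are obtained by splitting on white space after case folding, matching tokens are required to be adjacent in the crawled item, and an empty phrase matches nothing. The function names are hypothetical.

```python
from typing import List


def tokenize(text: str) -> List[str]:
    # Assumption: white space tokenization with case folding.
    return text.casefold().split()


def matches_phrase(phrase: str, item_text: str) -> bool:
    """phrase is the text between the double quotes of a text-phrase."""
    prefix_mode = phrase.endswith("*")
    if prefix_mode:
        phrase = phrase[:-1]  # drop the trailing asterisk
    wanted = tokenize(phrase)
    have = tokenize(item_text)
    if not wanted:
        return False  # assumption: an empty phrase matches nothing

    def token_ok(query_token: str, item_token: str) -> bool:
        if prefix_mode:
            return item_token.startswith(query_token)
        return item_token == query_token

    # Slide a window over the item's tokens and compare position by position.
    for start in range(len(have) - len(wanted) + 1):
        window = have[start:start + len(wanted)]
        if all(token_ok(q, t) for q, t in zip(wanted, window)):
            return True
    return False
```

Under these assumptions, the phrase "qui bro*" matches an item containing "the quick brown fox", whereas the phrase "qui bro" without the asterisk does not.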

The protocol server MUST interpret any query text that is not part of another syntax construct as a sequence of tokens. The exact way in which the protocol server extracts the tokens from the query text is an implementation detail, but it SHOULD take into account the language in which the search query was formulated.

 text-token = 1*( %x01-08 / %x0B-0C / %x0E-21 / %x23-27 / %x2A-10FFFF ); %x28 is "(", %x29 is ")"
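
The text-token rule translates directly into a character class. The sketch below assumes the query is a Unicode string; the name TEXT_TOKEN is hypothetical. Note that the range %x0E-21 includes the space character (%x20), so a text-token in this grammar can span several words; splitting it into individual tokens is left to the protocol server.

```python
import re

# Excluded: %x00, tab (%x09), line feed (%x0A), carriage return (%x0D),
# double quote (%x22), "(" (%x28) and ")" (%x29).
TEXT_TOKEN = re.compile(
    r"[\x01-\x08\x0B\x0C\x0E-\x21\x23-\x27\x2A-\U0010FFFF]+"
)

pieces = TEXT_TOKEN.findall("foo(bar)")
```

In this example, the parentheses are not part of any text-token, so the input is split into the two pieces "foo" and "bar".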