cho.sh
Notes
Loading...

Keyword Matching

Time limit

2s

Memory limit

128 MB

Problem

Search engines find pages that match a user's query. Real search engines use complex algorithms to respond quickly and accurately, but this problem uses a simple relevance score between pages and queries.

Several web pages are given. Each page is classified by 1 to 8 keywords, and the keywords on a page are listed from most relevant to least relevant. For example, a page about programming in Smalltalk might have the keywords Smalltalk, programming, and computers in that order, making Smalltalk the most relevant keyword.

A query also consists of 1 to 8 keywords, listed from most important to least important. If Smalltalk appears before computers in a query, then Smalltalk is the more important keyword.

For each query, determine up to 5 pages with the highest relevance scores. To compute relevance, each keyword in a page or query receives a positional weight. The first keyword has weight 8, and each following keyword has weight one less than the previous keyword.

When a keyword appears in both a page and a query, multiply its page weight by its query weight. The relevance score of that page for that query is the sum of these products over all shared keywords.

For example, consider these three pages.

  • Page 1: Smalltalk, programming, computers
  • Page 2: computers, programming
  • Page 3: computers, Smalltalk

For the query Smalltalk programming, the scores are 113 for Page 1 (= 8 x 8 + 7 x 7), 49 for Page 2 (= 7 x 7), and 56 for Page 3 (= 8 x 7). For the query Smalltalk computers, the scores are 106 for Page 1 (= 8 x 8 + 7 x 6), 56 for Page 2 (= 7 x 8), and 112 for Page 3 (= 8 x 7 + 7 x 8).

Input

The input consists of several lines, each describing either a web page or a query. Each line starts with an identifier followed by a list of keywords.

The identifier is one of P, Q, or E.

  • P: a web page
  • Q: a query
  • E: the end of input

There is at least one whitespace character between the identifier and the first keyword. Lines starting with P and Q may appear in any order. Pages are numbered from 1 in the order their P lines appear, and queries are numbered from 1 in the order their Q lines appear.

Each page has between 1 and 8 keywords. Each keyword has length at most 20. The input contains at most 25 pages.

Each query also has between 1 and 8 keywords. Each keyword has length at most 20.

Keyword matching is case-insensitive.

Output

Print one line for each query. The line contains the query identifier, a colon (:), and then the identifiers of the most relevant pages in order.

A page identifier has the form P#, where # is the page number. A query identifier has the form Q#, where # is the query number.

For each query, print at most 5 pages with positive relevance scores, ordered by decreasing relevance score. If multiple pages have the same score, print them in increasing page number order. Never print a page whose relevance score is 0.