Problem H
Intellectual Property
TDP Inc. has decided to sue JCN Inc. for copyright infringement. To this end, TDP wishes to find infringing segments within JCN’s code base, to show to selected media representatives. Since TDP has fired all its technical staff, it is looking to hire a consultant to be paid on a contingency basis should the lawsuit be successful. To demonstrate your qualifications for this position, you are to solve the problem on a number of test cases.
Input
Each test case begins with a positive integer $k \le 1000$, the number of infringing segments to be found. Following this line are two code bases. The first code base is preceded by the line “BEGIN TDP CODEBASE” and contains a number of lines followed by the line “END TDP CODEBASE”. The second code base is preceded by “BEGIN JCN CODEBASE” and followed by “END JCN CODEBASE”. The line “END TDP CODEBASE” does not appear in the first code base and the line “END JCN CODEBASE” does not appear in the second. Each code base consists only of printable ASCII characters (specifically ASCII codes 32–126) and newlines. A line containing $0$ follows the last test case.
Output
For each test case you should output:
-
a line containing “CASE $n$” where $n$ is the number of the test case;
-
up to $k$ infringing segments. Each segment should be printed exactly as it appears in the JCN code base (including newlines and whitespace) and should be preceded by a line containing “INFRINGING SEGMENT $m$ LENGTH $l$ POSITION $p$” where $m$ is the number of the segment within the test case, $l$ is the length of the segment (in characters) and $p$ is the position of the segment (in characters from the start of the JCN codebase). Output an empty line between test cases.
A code base is simply a string of characters. An infringing segment is a non-empty contiguous sequence of characters in the JCN code base that is textually identical to some contiguous sequence of characters in the TDP code base, and is not contained in a larger infringing segment. You should consider all characters in the code base, including spaces and the newline character at the end of each line.
If there are $k$ or fewer common segments, print them all in decreasing order by length; if several segments have the same length, print them in the order they occur in JCN’s code base. If there are more than $k$ segments, print the first $k$ according to the given order.
You may assume that no code base contains more than $50\, 000$ characters.
Sample Input 1 | Sample Output 1 |
---|---|
6 BEGIN TDP CODEBASE the quick brown fox jumps over the lazy dog. so there! END TDP CODEBASE BEGIN JCN CODEBASE now is the time for all good men to come to the aid of the party. so there! END JCN CODEBASE 100 BEGIN TDP CODEBASE xyzzy END TDP CODEBASE BEGIN JCN CODEBASE xyzzabczzyy END JCN CODEBASE 0 |
CASE 1 INFRINGING SEGMENT 1 LENGTH 12 POSITION 64 . so there! INFRINGING SEGMENT 2 LENGTH 5 POSITION 6 the INFRINGING SEGMENT 3 LENGTH 5 POSITION 42 o the INFRINGING SEGMENT 4 LENGTH 5 POSITION 43 the INFRINGING SEGMENT 5 LENGTH 5 POSITION 54 the INFRINGING SEGMENT 6 LENGTH 3 POSITION 15 fo CASE 2 INFRINGING SEGMENT 1 LENGTH 4 POSITION 0 xyzz INFRINGING SEGMENT 2 LENGTH 3 POSITION 7 zzy INFRINGING SEGMENT 3 LENGTH 2 POSITION 10 y |