## Document: Important Guidelines
### Overview
The image presents a set of annotation guidelines for evaluating descriptions of patterns in input-to-output string mappings, where the descriptions were apparently produced by GPT4. The guidelines cover pattern recognizability, the correctness and accuracy of descriptions, and mapping categories.
### Components/Axes
The document is structured as a list of bullet points, each representing a specific guideline or instruction. There are no axes or scales in this image.
### Detailed Analysis or Content Details
Here's a transcription of the text, organized by bullet point:
* **Important guidelines:**
* In Q1, we consider that "GPT4 indicated there is no pattern" if it either responded with the word "Unclear", or explained that there is no pattern in a sentence.
* In cases where the description of the model includes suggestive commentary about the hidden motivation for the function represented in the mappings (in addition to an explicit explanation), the commentary should not be considered. An example for a description which includes commentary is "The mappings generally consist of repetitions or small variations of their corresponding input string's characters, *suggesting a pattern related to breaking down or rearranging the input string*".
* We consider a pattern *recognizable* when it is apparent across 20 or more mappings. We require that *at least one* of the following will hold:
    * The functionality behind the mappings (of input to output strings) will be visible and clear - for example, mappings of words to their first letters.
    * The destination strings will be highly related to each other - for example, cases where all the source strings are mapped to numbers.
* In cases where there is a mutual pattern encompassing *only* the source strings, we do not consider this as a recognizable pattern.
* In Q2 we use the terms *correct* and *accurate* to label the descriptions. *Correct* descriptions describe the mappings and do not include incorrect parts. *Correct* descriptions might be *accurate* or *inaccurate*. The *inaccuracy* metric refers to whether the descriptions are too general (or too specific).
* In Q3, the different mapping categories are:
    * *Semantic* - the mapping encodes semantic associations of the input strings (which might require knowledge). For example, associating countries with their capitals or languages.
    * *Language* - the mapping encodes a relationship which requires language knowledge (e.g. syntactic or lexical expertise). For example, mapping words to prefixes, or nouns to pronouns.
    * *General* - the mapping encodes a general functionality, which naturally can be applied to a large subset of strings. For example, mapping a string to itself, or a number to its successor/predecessor.
    * *Unnatural* - the mapping *does not* encode a recognizable/understandable function or relation, one that might be used for natural language processing (see examples of unnatural patterns in *the examples spreadsheet*).
* Please use the Notes column to add any information, insight or problem you find relevant.
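The recognizability rule transcribed above can be read as a small decision procedure. The following is a minimal sketch of that reading, not code from the source: the `PatternJudgment` class, its field names, and the `is_recognizable` function are hypothetical, and the boolean inputs stand for judgments a human annotator would make.

```python
from dataclasses import dataclass

# Threshold stated in the guidelines: "apparent across 20 or more mappings".
MIN_SUPPORTING_MAPPINGS = 20


@dataclass
class PatternJudgment:
    """Hypothetical record of an annotator's judgments about one candidate pattern."""
    supporting_mappings: int           # number of mappings in which the pattern is apparent
    function_is_clear: bool            # e.g. words mapped to their first letters
    targets_are_related: bool          # e.g. all source strings mapped to numbers
    sources_are_related: bool = False  # a pattern over source strings only


def is_recognizable(j: PatternJudgment) -> bool:
    """Apply the rule: enough supporting mappings AND at least one of the two conditions."""
    if j.supporting_mappings < MIN_SUPPORTING_MAPPINGS:
        return False
    # sources_are_related is deliberately ignored: a pattern that only
    # encompasses the source strings does not count as recognizable.
    return j.function_is_clear or j.targets_are_related
```

Under this sketch, a pattern seen in 30 mappings whose only regularity lies in the source strings is rejected, while a pattern seen in exactly 20 mappings with highly related destination strings is accepted.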
### Key Observations
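* The guidelines direct annotators to judge only the explicit explanation in a description and to disregard suggestive commentary about the hidden motivation behind the mappings.
* Recognizability requires that a pattern be apparent across at least 20 mappings and that either the function behind the mappings is clear or the destination strings are highly related to each other. A pattern that only encompasses the source strings does not count.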
* The document defines specific categories for classifying mappings: Semantic, Language, General, and Unnatural.
* The guidelines refer to external resources, such as "the examples spreadsheet".
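To make the labeling scheme concrete, the Q2 labels and Q3 categories could be captured in a small annotation record. This is an illustrative sketch only: the `MappingCategory` enum, the `Annotation` class, and the label strings are hypothetical names, and the code assumes that accuracy is recorded only for descriptions already judged correct, which the guidelines imply but do not state outright.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MappingCategory(Enum):
    """The four Q3 categories named in the guidelines."""
    SEMANTIC = "semantic"    # e.g. countries to their capitals
    LANGUAGE = "language"    # e.g. nouns to pronouns
    GENERAL = "general"      # e.g. a number to its successor
    UNNATURAL = "unnatural"  # no recognizable function or relation


@dataclass
class Annotation:
    """Hypothetical record for one annotated description."""
    category: MappingCategory
    description_correct: bool
    # Assumption: accuracy is only meaningful for correct descriptions.
    description_accurate: Optional[bool] = None
    notes: str = ""  # free text, mirroring the Notes column

    def q2_label(self) -> str:
        """Combine the correctness and accuracy judgments into one Q2 label."""
        if not self.description_correct:
            return "incorrect"
        if self.description_accurate is None:
            raise ValueError("accuracy must be set for a correct description")
        return "correct, accurate" if self.description_accurate else "correct, inaccurate"
```

For instance, `Annotation(MappingCategory.SEMANTIC, True, False, notes="too general")` would carry the label `"correct, inaccurate"`, matching the guideline that a correct description may still be too general or too specific.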
### Interpretation
The document provides a framework for evaluating and categorizing patterns in data mappings. The guidelines aim to ensure consistency and accuracy in the description and classification of these patterns. The four mapping categories (Semantic, Language, General, Unnatural) are presented as a flat set of alternatives for characterizing the relationship between input and output strings; nothing in the text indicates a hierarchy among them. The references to "GPT4" and to descriptions produced by "the model" indicate that these guidelines are likely used to evaluate model-generated descriptions of such mappings.