For both Latin and Greek searches, your search term is turned into a pattern that ignores distinctions between upper and lower case, and that allows hyphenation and the indexing and formatting codes used in the databases to intervene. In the case of Latin, u and v are treated as equivalent, as are i and j.
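As a rough illustration of this transformation, here is a minimal Python sketch. It is not the code Diogenes uses: the function name latin_term_to_regex is hypothetical, only the i/j equivalence is shown, and a hyphen followed by whitespace stands in for the hyphenation and formatting codes of the real databases.

```python
import re

def latin_term_to_regex(term: str) -> re.Pattern:
    """Hypothetical sketch: build a tolerant pattern from a plain Latin term."""
    # Assumption: a hyphen plus optional whitespace stands in for the
    # hyphenation and formatting codes found in the real databases.
    gap = r"(?:-\s*)?"
    parts = []
    for ch in term:
        if ch.lower() in "ij":
            parts.append("[ij]")          # i and j are interchangeable
        else:
            parts.append(re.escape(ch))   # every other character is literal
    # The optional gap is allowed between any two characters of the term.
    return re.compile(gap.join(parts), re.IGNORECASE)

rx = latin_term_to_regex("iam")
print(bool(rx.search("Jam")))        # upper case and j for i
print(bool(rx.search("eti-\nam")))   # word hyphenated across a line break
```

Both calls should report a match, because the compiled pattern ignores case, accepts j for i, and lets the stand-in hyphenation intervene between letters.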
When constructing a search pattern, you may include spaces at the beginning, at the end, or in the middle of your pattern in order to indicate a word boundary. Thus the pattern "et" would find a match in the words aetas, etiam, scilicet and of course et, whereas " et" would only match in the words etiam and et. Likewise, "et " would match in scilicet and et, and if you entered the pattern " et ", you would match only the word et.
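The effect of the spaces can be sketched in Python as follows. This is only an approximation under stated assumptions: the helper name space_pattern_to_regex is hypothetical, a leading or trailing space is modelled with the regular-expression word boundary \b, and a space in the middle is modelled as a run of non-word characters.

```python
import re

def space_pattern_to_regex(pattern: str) -> re.Pattern:
    """Hypothetical sketch: treat spaces in a search pattern as word boundaries."""
    # Words typed by the user, with the surrounding spaces removed.
    words = pattern.strip(" ").split()
    # Assumption: a space in the middle means "some non-word characters here".
    body = r"\W+".join(re.escape(w) for w in words)
    # A leading or trailing space anchors that end to a word boundary.
    prefix = r"\b" if pattern.startswith(" ") else ""
    suffix = r"\b" if pattern.endswith(" ") else ""
    return re.compile(prefix + body + suffix, re.IGNORECASE)

def matching_words(pattern: str, words: list[str]) -> list[str]:
    """Return the words in which the pattern finds a match."""
    rx = space_pattern_to_regex(pattern)
    return [w for w in words if rx.search(w)]

sample = ["aetas", "etiam", "scilicet", "et"]
for p in ["et", " et", "et ", " et "]:
    print(repr(p), matching_words(p, sample))
```

Run against the four sample words, the four patterns reproduce the behaviour described above: "et" matches all of them, " et" only etiam and et, "et " only scilicet and et, and " et " only et.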
The way in which this is accomplished is by turning your pattern into a more complex pattern called a "regular expression", a notation that permits the stipulation of highly complex patterns. Diogenes permits you to use certain aspects of the Perl regular expression syntax in combination with the Perseus-style transliteration. Some of the features that you may wish to use are (NB. These are not available if you are using Beta-style input for Greek searches):
Note that Diogenes performs its own transformation of your input, so the full range of Perl regular expressions is not available to those who use the Perseus-style input, but what is available should be sufficient for many uses.
There is no space here for an extensive description of regular expressions, but for a sense of their flexibility, consider the following pattern:
" re(x|g(is?(bus)?|e[ms]?|um)) "
This pattern matches the word "rex" in all of its case forms while ignoring other related words (regnum, regina, etc.). Note that the pattern begins and ends with a space, which means that the match must begin and end on a word boundary. The pattern then stipulates that the beginning word boundary must be followed by the letters "r" and "e", and that these must be followed by a complex parenthesized sub-expression that defines the rest of the word. This breaks down into: either the letter "x", or the letter "g" followed in turn by another nested sub-expression in round parentheses. What must follow the "g" is defined as: either the letter "i", followed optionally by the letter "s" and then optionally by the letters "bus"; or the letter "e", followed optionally by either the letter "m" or the letter "s" (but not both); or the letters "um".
In this way, you can define searches quite narrowly, while not worrying about hyphenated words, upper and lower case, or in the case of Greek, accentuation.
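The "rex" pattern above can be tried outside Diogenes with any Perl-compatible regular expression engine. The sketch below uses Python, whose syntax agrees with Perl's for the constructs involved. It assumes that each space in the pattern can be approximated by the word boundary \b, and the word lists are illustrative only.

```python
import re

# The pattern from the text, with each space written as \b.
# Assumption: \b approximates how Diogenes treats a space.
REX = re.compile(r"\bre(x|g(is?(bus)?|e[ms]?|um))\b", re.IGNORECASE)

# Inflected forms of "rex" that the pattern is meant to find.
forms = ["rex", "regis", "regi", "regem", "rege", "reges", "regum", "regibus"]
# Related words that it is meant to ignore.
related = ["regnum", "regina", "regia"]

matched = [w for w in forms + related if REX.search(w)]
print(matched)
```

Only the eight forms of "rex" are printed; the related words are rejected because the closing word boundary cannot fall in the middle of a word. The pattern is slightly more permissive than the paradigm, since it would also accept the non-existent form "regisbus".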
P. J. Heslin, 2001-7