본문 바로가기

Oracle/SQL Query

Oracle Regular Expression

천정현 2009. 6. 22. 10:27

Table 4-1 SQL Regular Expression Functions and Conditions

SQL Element	Category	Description
REGEXP_LIKE	Condition	Searches a character column for a pattern. Use this function in the `WHERE` clause of a query to return rows matching a regular expression. The condition is also valid in a constraint or as a PL/SQL function returning a boolean. The following `WHERE` clause filters employees with a first name of Steven or Stephen: WHERE REGEXP_LIKE(first_name, '^Ste(v\|ph)en$')
REGEXP_REPLACE	Function	Searches for a pattern in a character column and replaces each occurrence of that pattern with the specified string. The following function puts a space after each character in the `country_name` column: REGEXP_REPLACE(country_name, '(.)', '\1 ')
REGEXP_INSTR	Function	Searches a string for a given occurrence of a regular expression pattern and returns an integer indicating the position in the string where the match is found. You specify which occurrence you want to find and the start position. For example, the following performs a boolean test for a valid email address in the `email` column: REGEXP_INSTR(email, '\w+@\w+(\.\w+)+') > 0
REGEXP_SUBSTR	Function	Returns the substring matching the regular expression pattern that you specify. The following function uses the `x` flag to match the first string by ignoring spaces in the regular expression: REGEXP_SUBSTR('oracle', 'o r a c l e', 1, 1, 'x')

Table 4-2 POSIX Metacharacters in Oracle Database Regular Expressions

Syntax	Operator Name	Description	Example
`.`	Any Character — Dot	Matches any character in the database character set. If the `n` flag is set, it matches the newline character. The newline is recognized as the linefeed character (`\x0a`) on UNIX and Windows or the carriage return character (`\x0d`) on Macintosh platforms. Note: In the POSIX standard, this operator matches any English character except NULL and the newline character.	The expression `a.b` matches the strings `abb`, `acb`, and `adb`, but does not match `acc`.
`+`	One or More — Plus Quantifier	Matches one or more occurrences of the preceding subexpression.	The expression `a+` matches the strings `a`, `aa`, and aaa, but does not match `bbb`.
`?`	Zero or One — Question Mark Quantifier	Matches zero or one occurrence of the preceding subexpression.	The expression `ab?c` matches the strings `abc` and `ac`, but does not match `abbc`.
`*`	Zero or More — Star Quantifier	Matches zero or more occurrences of the preceding subexpression. By default, a quantifier match is greedy because it matches as many times as possible while still allowing the rest of the match to succeed.	The expression `ab*c` matches the strings `ac`, `abc`, and `abbc`, but does not match `abb`.
`{m}`	Interval—Exact Count	Matches exactly `m` occurrences of the preceding subexpression.	The expression `a{3}` matches the strings `aaa`, but does not match `aa`.
`{m,}`	Interval—At Least Count	Matches at least `m` occurrences of the preceding subexpression.	The expression `a{3,}` matches the strings `aaa` and `aaaa`, but does not match `aa`.
`{m,n}`	Interval—Between Count	Matches at least `m`, but not more than `n` occurrences of the preceding subexpression.	The expression `a{3,5}` matches the strings `aaa`, `aaaa`, and `aaaaa`, but does not match `aa`.
`[ ... ]`	Matching Character List	Matches any single character in the list within the brackets. The following operators are allowed within the list, but other metacharacters included are treated as literals: Range operator: `-` POSIX character class: `[: :]` POSIX collation element: `[. .]` POSIX character equivalence class: `[= =]` A dash (`-`) is a literal when it occurs first or last in the list, or as an ending range point in a range expression, as in `[#--]`. A right bracket (`]`) is treated as a literal if it occurs first in the list. Note: In the POSIX standard, a range includes all collation elements between the start and end of the range in the linguistic definition of the current locale. Thus, ranges are linguistic rather than byte values ranges; the semantics of the range expression are independent of character set. In Oracle Database, the linguistic range is determined by the `NLS_SORT` initialization parameter.	The expression `[abc]` matches the first character in the strings `all`, `bill`, and `cold`, but does not match any characters in `doll`.
`[^ ... ]`	Non-Matching Character List	Matches any single character not in the list within the brackets. Characters not in the non-matching character list are returned as a match. Refer to the description of the Matching Character List operator for an account of metacharacters allowed in the character list.	The expression `[^abc]` matches the character `d` in the string `abcdef`, but not the character `a`, `b`, or `c`. The expression `[^abc]+` matches the sequence `def` in the string `abcdef`, but not `a`, `b`, or `c`. The expression `[^a-i]` excludes any character between `a` and `i` from the search result. This expression matches the character `j` in the string `hij`, but does not match any characters in the string `abcdefghi`.
`\|`	Or	Matches one of the alternatives.	The expression `a\|b` matches character `a` or character `b`.
`( ... )`	Subexpression or Grouping	Treats the expression within parentheses as a unit. The subexpression can be a string of literals or a complex expression containing operators.	The expression `(abc)?def` matches the optional string `abc`, followed by `def`. Thus, the expression matches `abcdefghi` and `def`, but does not match `ghi`.
`\n`	Backreference	Matches the n^th preceding subexpression, that is, whatever is grouped within parentheses, where `n` is an integer from 1 to 9. The parentheses cause an expression to be remembered; a backreference refers to it. A backreference counts subexpressions from left to right, starting with the opening parenthesis of each preceding subexpression. The expression is invalid if the source string contains fewer than n subexpressions preceding the `\n`. Oracle supports the backreference expression in the regular expression pattern and the replacement string of the `REGEXP_REPLACE` function.	The expression `(abc\|def)xy\1` matches the strings `abcxyabc` and `defxydef`, but does not match `abcxydef` or `abcxy`. A backreference enables you to search for a repeated string without knowing the actual string ahead of time. For example, the expression `^(.*)\1$` matches a line consisting of two adjacent instances of the same string.
`\`	Escape Character	Treats the subsequent metacharacter in the expression as a literal. Use a backslash (\) to search for a character that is normally treated as a metacharacter. Use consecutive backslashes (`\\`) to match the backslash literal itself.	The expression `\+` searches for the plus character (`+`). It matches the plus character in the string `abc+def`, but does not match `abcdef`.
`^`	Beginning of Line Anchor	Matches the beginning of a string (default). In multiline mode, it matches the beginning of any line within the source string.	The expression `^def` matches `def` in the string `defghi` but does not match `def` in `abcdef`.
`$`	End of Line Anchor	Matches the end of a string (default). In multiline mode, it matches the beginning of any line within the source string.	The expression `def$` matches `def` in the string `abcdef` but does not match `def` in the string `defghi`.
`[:class:]`	POSIX Character Class	Matches any character belonging to the specified POSIX character `class`. You can use this operator to search for characters with specific formatting such as uppercase characters, or you can search for special characters such as digits or punctuation characters. The full set of POSIX character classes is supported. Note: In English regular expressions, range expressions often indicate a character class. For example, `[a-z]` indicates any lowercase character. This convention is not useful in multilingual environments, where the first and last character of a given character class may not be the same in all languages. Oracle supports the character classes in Table 4-3 based on character class definitions in Globalization classification data.	The expression `[[:upper:]]+` searches for one or more consecutive uppercase characters. This expression matches `DEF` in the string `abcDEFghi` but does not match the string `abcdefghi`.
`[.element.]`	POSIX Collating Element Operator	Specifies a collating element to use in the regular expression. The `element` must be a defined collating element in the current locale. Use any collating element defined in the locale, including single-character and multicharacter elements. The `NLS_SORT` initialization parameter determines supported collation elements.This operator lets you use a multicharacter collating element in cases where only one character would otherwise be allowed. For example, you can ensure that the collating element `ch`, when defined in a locale such as Traditional Spanish, is treated as one character in operations that depend on the ordering of characters.	The expression `[[.ch.]]` searches for the collating element `ch` and matches `ch` in string `chabc`, but does not match `cdefg`. The expression `[a-[.ch.]]` specifies the range `a` to `ch`.
`[=character=]`	POSIX Character Equivalence Class	Matches all characters that are members of the same character equivalence class in the current locale as the specified `character`. The character equivalence class must occur within a character list, so the character equivalence class is always nested within the brackets for the character list in the regular expression. Usage of character equivalents depends on how canonical rules are defined for your database locale. Refer to the Oracle Database Globalization Support Guide for more information on linguistic sorting and string searching.	The expression `[[=n=]]` searches for characters equivalent to `n` in a Spanish locale. It matches both `N` and `ñ` in the string `El Niño`.

See Also:

Oracle Database SQL Reference for syntax, descriptions, and examples of the REGEXP functions and conditions

참고 문서 주소 : http://download.oracle.com/docs/cd/B19306_01/appdev.102/b14251/adfns_regexp.htm#sthref534

티스토리툴바