The README below is fetched from the published project artifact. Some links may be broken.

# aho-corasick

A Clojure library designed to solve set matching (or multi-string matching) problem by applying the classic aho-corasick algorithm.

## Original Algorithm listing

The following algorithm listing is from the original paper “Efficient String Mathcing: An Aid to Bibliographic Search (1975)” by Alfred V. Aho and Margaret J. Corasick.

### Algorithm 1. Pattern matching machine.

**Input.** A text string x = a1a2…an where each ai is an input symbol and a pattern matching machine M with goto function g, failure function f, and output function output, as described above.

**Output.** Locations at which keywords occur in x.

**Method.** ```
begin
state := 0
for i := 1 until n do
begin
while g(state, ai) == fail do state := f(state)
state := g(state, ai)
if output(state) != empty then
begin
print i
print output(state)
end
end
end
```

### Algorithm 2. Construction of the goto function.

**Input.** Set of keywords K = {y1, y2, …, yk}.

**Output.** Goto function g and a partially computed output function output.

**Method.** We assume output(s) is empty when state s is first created, and g(s, a) == fail if a is undefined or if g(s, a) has not yet been defined. The procedure enter(y) inserts into the goto graph a path that spells out y. ``` begin newstate := 0 for i := 1 until k do enter(yi) for all a such that g(0, a) == fail do g(0, a) := 0 end

procedure enter(a1a2…am): begin state := 0; j := 1 while g(state, aj) != fail do begin state := g(state, aj) j := j + 1 end for p := j until m do begin newstate := newstate + 1 g(state, ap) := newstate state := newstate end output(state) := {a1a2…am} end ```

### Algorithm 3. Construction of the failure function.

**Input.** Goto function g and output function output from Algorithm 2.

**Output.** Failure function f and output function output.

**Method.** ```
begin
queue := empty
for each a such that g(0, a) != 0 do
begin
s := g(0, a)
queue := queue U {s}
f(s) := 0
end
while queue != empty do
begin
let r be the next state in queue
queue := queue - {r}
for each a such that g(r, a) != fail do
begin
s := g(r, a)
queue := queue U {s}
state := f(r)
while g(state, a) == fail do state := f(state)
f(s) := g(state, a)
output(s) := output(s) U output(f(s))
end
end
end
```

FIXME

## License

Copyright © 2016 FIXME

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.