    May 20, 2014


    OWNER
    alxlit
    alxlit.name


    autoclave

    A library for safely handling various kinds of user input. The idea is to provide a simple, convenient API that builds upon existing, proven libraries such as JSON Sanitizer, HTML Sanitizer, and PegDown.

    Installation

    :dependencies [[autoclave "0.1.7"]]
    

    Usage

    (require '[autoclave.core :refer :all])
    

    JSON

    The json-sanitize function takes a string containing JSON-like content and produces well-formed JSON. It corrects minor encoding mistakes and makes the result easier to embed in HTML and XML documents.

    (json-sanitize "{some: 'derpy json' n: +123}")
    ; "{\"some\": \"derpy json\" ,\"\":\"n\" ,\"\":123}"
    

    More information, quoted from the JSON Sanitizer documentation:

    The output is well-formed JSON as defined by RFC 4627. The output satisfies (four) additional properties:

    1. The output will not contain the substring (case-insensitively) </script so can be embedded inside an HTML script element without further encoding.
    2. The output will not contain the substring ]]> so can be embedded inside an XML CDATA section without further encoding.
    3. The output is a valid JavaScript expression, so can be parsed by JavaScript’s eval builtin (after being wrapped in parentheses) or by JSON.parse. Specifically, the output will not contain any string literals with embedded JS newlines (U+2028 Line separator or U+2029 Paragraph separator).
    4. The output contains only valid Unicode scalar values (no isolated UTF-16 surrogates) that are allowed in XML unescaped.
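    Property 1 is what makes it reasonable to inline sanitized JSON in a page. The sketch below is one way that might look. The inline-json-script helper and its arguments are hypothetical and not part of autoclave; only json-sanitize comes from the library.

    ```clojure
    (require '[autoclave.core :refer [json-sanitize]])

    (defn inline-json-script
      "Hypothetical helper (not part of autoclave): builds a script element
      that assigns sanitized JSON to the variable named by var-name.
      var-name must come from trusted code, never from user input."
      [var-name untrusted-json]
      (str "<script>var " var-name " = "
           (json-sanitize untrusted-json)
           ";</script>"))

    ;; The JSON-like input is sanitized before it is placed in the page.
    (inline-json-script "settings" "{theme: 'dark'}")
    ```

    Only the JSON payload is sanitized here; any other user-supplied text placed in the page still needs its own escaping.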

    Since JSON Sanitizer isn’t available from Maven Central or Clojars or any other repositories that I know of, its source is included locally, unmodified, in src/java/com/google/json.

    HTML

    By default, the html-sanitize function strips all HTML from a string.

    (html-sanitize "Hello, <script>alert(\"0wn3d\");</script>world!")
    ; "Hello, world!"
    

    Policies

    You can create policies using html-policy to whitelist certain HTML elements and attributes with fine-grained control.

    (def policy (html-policy :allow-elements ["a"]
                             :allow-attributes ["href" :on-elements ["a"]]
                             :allow-standard-url-protocols
                             :require-rel-nofollow-on-links))
    
    (html-sanitize policy "<a href=\"http://github.com/\">GitHub</a>")
    ; "<a href=\"http://github.com/\" rel=\"nofollow\">GitHub</a>"
    

    Here are the available options (adapted from the HTML Sanitizer's HtmlPolicyBuilder documentation):

    • :allow-attributes [& attr-names attr-options] Allow specific attributes. The following options are available:
      • :globally Allow the specified attributes to appear on all elements.
      • :matching [pattern] Allow only values that match the provided regular expression (java.util.regex.Pattern).
      • :matching [f] Allow the named attributes for which (f element-name attr-name value) returns a non-nil, possibly adjusted value.
      • :on-elements [& element-names] Allow the named attributes only on the named elements.
    • :allow-common-block-elements Allows p, div, h[1-6], ul, ol, li, and blockquote.
    • :allow-common-inline-formatting-elements Allows b, i, font, s, u, o, sup, sub, ins, del, strong, strike, tt, code, big, small, br, and span elements.
    • :allow-elements [f & element-names] Allow the named elements for which (f element-name ^java.util.List attrs) returns a non-nil, possibly adjusted element-name.
    • :allow-elements [& element-names] Allow the named elements.
    • :allow-standard-url-protocols Allows http, https, and mailto to appear in URL attributes.
    • :allow-styling Convert style attributes to simple font tags to allow color, size, typeface, and other styling.
    • :allow-text-in [& element-names] Allow text in the named elements.
    • :allow-url-protocols [& url-protocols] Allow the given URL protocols.
    • :allow-without-attributes [& element-names] Allow the named elements to appear without any attributes.
    • :disallow-attributes [& attr-names attr-options] Disallow the named attributes. See :allow-attributes for available options.
    • :disallow-elements [& element-names] Disallow the named elements.
    • :disallow-text-in [& element-names] Disallow text to appear in the named elements.
    • :disallow-url-protocols [& url-protocols] Disallow the given URL protocols.
    • :disallow-without-attributes [& element-names] Disallow the named elements to appear without any attributes.
    • :require-rel-nofollow-on-links Require rel="nofollow" in links (adding it if not present).
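    As a rough illustration of combining several of these options, here is a sketch of a policy for user comments. The name comment-policy and the sample input are made up for the example, and the vector form of :allow-url-protocols is assumed to follow the same convention as :allow-elements in the example above.

    ```clojure
    (require '[autoclave.core :refer [html-policy html-sanitize]])

    ;; Hypothetical policy: inline formatting plus https-only links.
    (def comment-policy
      (html-policy :allow-common-inline-formatting-elements
                   :allow-elements ["a"]
                   :allow-attributes ["href" :on-elements ["a"]]
                   ;; Assumed syntax: protocols passed as a vector.
                   :allow-url-protocols ["https"]
                   :require-rel-nofollow-on-links))

    ;; The onclick attribute is not whitelisted, so it should not survive.
    (html-sanitize comment-policy
                   "<b>Docs</b> at <a href=\"https://example.com/\" onclick=\"x()\">example</a>")
    ```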

    Predefined policies

    Several policies come predefined for convenience. You can access them using the html-policy or html-merge-policies functions (see below).

    (def policy (html-policy :BLOCKS))
    
    • :BLOCKS Allows common block elements, as in :allow-common-block-elements.
    • :FORMATTING Allows common inline formatting elements as in :allow-common-inline-formatting-elements.
    • :IMAGES Allows img tags with alt, src, border, height, and width attributes, with appropriate restrictions.
    • :LINKS Allows a tags with standard URL protocols and rel="nofollow".
    • :STYLES Allows simple styling as in :allow-styling.

    Merging policies

    You can merge policies using html-merge-policies. Provide it with a sequence of option sequences or PolicyFactory objects (such as those returned by html-policy).

    (def policy (html-merge-policies :BLOCKS :FORMATTING :LINKS))
    

    Markdown

    Yes, there’s already a PegDown wrapper for Clojure (called cegdown). But this one’s got a few more features and I’m including it for the sake of completeness.

    By default the markdown-to-html function simply adheres to the original Markdown specification.

    (markdown-to-html "# Hello, \"<em>world</em>\"")
    ; "<h1>Hello, \"<em>world</em>\"</h1>"
    

    Processors

    The markdown-processor function returns a processor factory with the specified behavior. Suppose, for example, you wanted to suppress all user-supplied HTML:

    (def processor (markdown-processor :quotes
                                       :suppress-all-html))
    
    (markdown-to-html processor "# Hello, \"<em>world</em>\"")
    ; "<h1>Hello, &ldquo;world&rdquo;</h1>"
    

    The processor returned by markdown-processor is also thread-safe.

    Here are the available options (adapted from PegDown's Extensions documentation):

    • :abbreviations Enable abbreviations.
    • :all Enable all extensions, excluding the :suppress-* ones.
    • :autolinks Enable automatic linking of URLs.
    • :definitions Enable definition lists.
    • :fenced-code-blocks Enable fenced code blocks (two different syntaxes are supported).
    • :hardwraps Enable interpretation of single newlines as hardwraps.
    • :none Don’t enable any extensions (default).
    • :quotes Turn single and double quotes and angle quotes into fancy entities.
    • :smarts Turn ellipses, dashes, and apostrophes into fancy entities.
    • :smartypants Enable :quotes and :smarts.
    • :strikethrough Enable strikethrough.
    • :suppress-all-html Enable both :suppress-html-blocks and :suppress-inline-html.
    • :suppress-html-blocks Suppress user-supplied block HTML tags.
    • :suppress-inline-html Suppress user-supplied inline HTML tags.
    • :tables Enable tables.
    • :wikilinks Enable [[wiki-style links]] (see below for more information).
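    The Markdown and HTML functions can also be chained. The following is a minimal sketch, assuming you want to render untrusted Markdown and then run the result through a whitelist as a second line of defense. The names comment-processor, comment-policy, and render-comment are hypothetical.

    ```clojure
    (require '[autoclave.core :refer [html-merge-policies html-sanitize
                                      markdown-processor markdown-to-html]])

    ;; Processor options taken from the list above.
    (def comment-processor
      (markdown-processor :autolinks :smartypants :suppress-all-html))

    ;; Predefined policies described in the HTML section.
    (def comment-policy
      (html-merge-policies :BLOCKS :FORMATTING :LINKS))

    (defn render-comment
      "Hypothetical helper: untrusted Markdown in, sanitized HTML out."
      [markdown]
      (->> markdown
           (markdown-to-html comment-processor)
           (html-sanitize comment-policy)))

    (render-comment "# Hello\n\nSee http://example.com for *details*.")
    ```

    Elements outside the merged policies, such as tables or pre blocks, would be stripped by the second step, so the policy needs to cover whatever extensions the processor enables.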

    Link renderers

    You can customize how automatic, explicit (or inline), mail, reference, and wiki links are rendered by supplying your own LinkRenderer. The markdown-link-renderer function provides a nicer way to proxy it.

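    (require '[clojure.string :refer [capitalize]])

    (def link-renderer (markdown-link-renderer
                         {:auto (fn [node]
                                  ;; Use the capitalized host name as the link text.
                                  {:text (->> (.getText node)
                                              (re-find #"://(\w+)\.")
                                              second
                                              capitalize)
                                   :href (.getText node)
                                   :attributes ["class" "autolink"]})}))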
    
    (def processor (markdown-processor :autolinks))
    
    (markdown-to-html processor link-renderer "http://google.com")
    ; "<a href=\"http://google.com\" class=\"autolink\">Google</a>"
    

    The available overrides are (adapted from PegDown's LinkRenderer):

    • :auto [^AutoLinkNode node]
    • :explicit [^ExpLinkNode node ^String text]
    • :mail [^MailLinkNode node]
    • :reference [^RefLinkNode node ^String url ^String title ^String text]
    • :wiki [^WikiLinkNode node]

    They should return a map containing the link’s :text, :href, and any other :attributes (as a flat sequence of strings) as in the example above.
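    For instance, a :wiki override might look like the sketch below. It assumes that WikiLinkNode exposes the page name through .getText (as the :auto example does for its node), and the "/wiki/" URL prefix and the names wiki-link-renderer and wiki-processor are invented for illustration.

    ```clojure
    (require '[autoclave.core :refer [markdown-link-renderer markdown-processor
                                      markdown-to-html]]
             '[clojure.string :as string])

    (def wiki-link-renderer
      (markdown-link-renderer
        {:wiki (fn [node]
                 ;; Assumption: .getText returns the text between [[ and ]].
                 (let [page (.getText node)]
                   {:text page
                    ;; "/wiki/" is an assumed URL prefix for this sketch.
                    :href (str "/wiki/" (string/replace page " " "-"))
                    :attributes ["class" "wikilink"]}))}))

    ;; Wiki links are only parsed when the :wikilinks extension is enabled.
    (def wiki-processor (markdown-processor :wikilinks))

    (markdown-to-html wiki-processor wiki-link-renderer "See [[Getting Started]].")
    ```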

    Other

    License

    Copyright © 2013 Alex Little

    Distributed under the Eclipse Public License, the same as Clojure.