CrossClj

0.1.0 docs


    soup-clj

    Clojars

    Dec 26, 2013


    OWNER
    Michael Doaty




    soup-clj

    soup-clj is a Clojure web scraping library built on Jsoup.

    Installation

    Add the following dependency to your project.clj file:

      [soup-clj "0.1.0"]
    

    Usage

    (def connection
      (setup-config {:url "http://google.com"
                     :cookie {:name "me"}
                     :user-agent "Mozilla/5.0"
                     :timeout 5000}))
    

    About configuration

    • You create and pass around your own connection; there are no dynamic vars
    • Each key-value pair is passed as an argument to Jsoup's Connection class
    • Options that take more than one argument, such as :cookie, take a map in which the key is the first argument and the value is the second
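The points above can be read as a mapping from config keys to calls on Jsoup's Connection. The sketch below shows that reading with plain Jsoup interop. `build-connection` is a hypothetical helper written for illustration, not soup-clj's `setup-config`, and the way the library handles keys internally is an assumption here. It assumes Jsoup is on the classpath.

```clojure
(import '(org.jsoup Jsoup Connection))

;; Hypothetical helper (not part of soup-clj): applies a config map
;; to a Jsoup Connection. No request is sent until .get or .execute is called.
(defn build-connection
  ^Connection [{:keys [url user-agent timeout cookie]}]
  (let [^Connection conn (Jsoup/connect url)]
    ;; single-argument options map to one method call each
    (when user-agent (.userAgent conn user-agent))
    (when timeout (.timeout conn (int timeout)))
    ;; two-argument option: each map entry becomes (.cookie conn name value)
    (doseq [[k v] cookie]
      (.cookie conn (name k) (str v)))
    conn))

(def conn
  (build-connection {:url "http://example.com"
                     :cookie {:name "me"}
                     :user-agent "Mozilla/5.0"
                     :timeout 5000}))
```

Because the connection is an ordinary value, several differently configured connections can coexist without rebinding any global state.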

    Example

    ;;; setup config for youtube.com
    (def conn (setup-config {:url "http://youtube.com"
                             :user-agent "Mozilla/5.0"
                             :timeout 5000}))
    
    ;;; recommended videos from youtube
    (sip conn ".yt-shelf-grid-item")
    
    
    ;;; take the first recommended video and build a map from it;
    ;;; gulp returns a map with one entry per lookup
    (gulp (first (sip conn ".yt-shelf-grid-item"))
          {:title [".yt-lockup-title a" "title" first]
           :href [".yt-lockup-title a" "href" first]
           :date-added ["li.yt-lockup-deemphasized-text" nil #(.text %)]})
           
    ;;; returns
    {:title "title of video"
     :href "/watch?v=something"
     :date-added "# of days ago"}
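Each lookup in the example is a vector of a CSS selector, an attribute name (or nil), and a function. One plausible reading is: select the matching elements, collect the named attribute if one is given, then apply the function to the result. The sketch below implements that reading directly against Jsoup on an inline HTML string. `lookup` and `extract` are hypothetical names, the HTML and selectors are made up for illustration, and none of this is taken from soup-clj's source.

```clojure
(import '(org.jsoup Jsoup)
        '(org.jsoup.nodes Element)
        '(org.jsoup.select Elements))

;; Hypothetical helper: resolves one [selector attr f] lookup against an element.
(defn lookup
  [^Element root [selector attr f]]
  (let [^Elements found (.select root ^String selector)]
    (f (if attr
         ;; attribute given: f receives a vector of attribute values
         (mapv (fn [^Element e] (.attr e ^String attr)) found)
         ;; no attribute: f receives the selected Elements themselves
         found))))

;; Hypothetical helper: builds a map with one entry per lookup.
(defn extract [root spec]
  (into {} (for [[k triple] spec] [k (lookup root triple)])))

;; Made-up markup standing in for one scraped item.
(def html
  (str "<div class=\"item\">"
       "<h3 class=\"title\"><a href=\"/watch?v=abc\" title=\"Demo video\">Demo</a></h3>"
       "<span class=\"meta\">2 days ago</span>"
       "</div>"))

(def item (first (.select (Jsoup/parse ^String html) "div.item")))

(def result
  (extract item
           {:title      ["h3.title a" "title" first]
            :href       ["h3.title a" "href" first]
            :date-added ["span.meta" nil #(.text %)]}))
;; => {:title "Demo video", :href "/watch?v=abc", :date-added "2 days ago"}
```

Under this reading, the trailing function decides whether a lookup yields a single value (`first`) or something computed from the whole selection (`#(.text %)`).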
           
    

    License

    Copyright © 2013 Michael Doaty

    Distributed under the Eclipse Public License, same as Clojure.