Class: Arachni::Analyzer
- Inherits:
- Object
- Object
- Arachni::Analyzer
- Includes:
- UI::Output
- Defined in:
- lib/analyzer.rb
Overview
Analyzer class
Analyzes HTML code and extracts forms, links and cookies, depending on the
user's options.
It captures all element attributes, not just URLs and variables.
All URLs are converted to absolute URLs, and URLs outside the domain are
ignored.
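As a rough illustration of the URL handling described above, the sketch below resolves each href against the page URL and keeps it only when the host matches. The helper names `absolutize` and `same_domain?` are hypothetical stand-ins; the class's own `to_absolute` and `in_domain?` helpers are not shown on this page and may behave differently (for example with regard to subdomains).

```ruby
require 'uri'

# Hypothetical helper: resolve +href+ against the URL of the page it came from.
# Returns nil when the href cannot be parsed.
def absolutize( page_url, href )
    URI.join( page_url, href ).to_s
rescue URI::Error
    nil
end

# Hypothetical helper: true when +url+ points at the same host as +page_url+.
def same_domain?( page_url, url )
    URI.parse( url ).host == URI.parse( page_url ).host
end

page  = 'http://example.com/dir/index.html'
hrefs = [ 'about.html', '/contact?x=1', 'http://other.example.org/' ]

# Convert everything to absolute URLs, then drop anything outside the domain.
kept = hrefs.map { |h| absolutize( page, h ) }.compact.
    select { |u| same_domain?( page, u ) }
```

Here `kept` ends up holding the two `example.com` URLs in absolute form, while the link to `other.example.org` is discarded.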
Forms
Form analysis uses both regular expressions and the Nokogiri parser in
order to handle badly written HTML code, such as unclosed tags and
overlapping tags.
To ease audits, in addition to parsing forms into data structures like
“select” and “option”, all auditable inputs are put under the
“auditable” key.
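The two-pass idea can be sketched with plain regular expressions. This is a simplified illustration only: it leaves out the Nokogiri pass, the regexes are cut-down variants of the ones used by `get_forms`, and the sample HTML and input hashes are made up.

```ruby
html = <<-HTML
<form action="/login" method="post"><input name="user"></form>
<p>stray text</p>
<form action="/search"><input name="q">
HTML

# Pass 1: grab the contents of properly closed forms.
closed = html.scan( /<form(.*?)<\/form>/im ).flatten

# Remove what was captured, so only leftovers remain in the document...
rest = closed.inject( html ) { |doc, form| doc.sub( form, '' ) }

# Pass 2: ...and treat any remaining "<form ..." as an unclosed form.
unclosed = rest.scan( /<form (.*)/im ).flatten

# Merging element lists under one key, as done for the "auditable" entry.
# Array#| is a set union, so duplicate hashes collapse into one.
inputs    = [ { 'name' => 'user' } ]
textareas = [ { 'name' => 'comment' } ]
auditable = inputs | textareas
```

With this input, `closed` holds the login form and `unclosed` holds the search form whose closing tag is missing.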
Links
Links are extracted using the Nokogiri parser.
Cookies
Cookies are extracted from the HTTP headers and parsed by WEBrick::Cookie.
@author: Anastasios “Zapotek” Laskos
<tasos.laskos@gmail.com> <zapotek@segfault.gr>
@version: 0.1-pre
Instance Attribute Summary
-
- (Array<Hash <String, String> >) cookies
readonly
Array of extracted cookies.
-
- (Array<Hash <String, String> >) forms
readonly
Array of extracted HTML forms.
-
- (Array<String>) headers
readonly
Array of valid HTTP headers.
-
- (Array<Hash <String, String> >) links
readonly
Array of extracted HTML links.
-
- (Options) opts
readonly
Options instance.
-
- (Hash<String, Hash<Array, Hash>>) structure
readonly
Structure of the html elements in Hash format.
-
- (String) url
The url of the page.
Instance Method Summary
-
- (Array<Hash <String, String> >) get_cookies(headers)
Extracts cookies from HTTP headers.
-
- (Array<Hash <String, String> >) get_forms(html)
Extracts forms from an HTML document. TODO: Add support for radio buttons.
-
- (Hash) get_headers
Returns a list of valid auditable HTTP header fields.
-
- (Hash) get_link_vars(link)
Extracts variables and their values from a link.
-
- (Array<Hash <String, String> >) get_links(html)
Extracts links from HTML document.
-
- (Analyzer) initialize(opts)
constructor
Constructor. Instantiates the Analyzer class with user options.
-
- (Hash<String, Hash<Array, Hash>>) run(url, html, headers)
Runs the Analyzer and extracts forms, links and cookies.
Methods included from UI::Output
#debug!, #debug?, #only_positives!, #only_positives?, #print_debug, #print_debug_backtrace, #print_debug_pp, #print_error, #print_info, #print_line, #print_ok, #print_status, #print_verbose, #verbose!, #verbose?
Constructor Details
- (Analyzer) initialize(opts)
Constructor
Instantiates Analyzer class with user options.
    # File 'lib/analyzer.rb', line 95
    def initialize( opts )
        @url = ''
        @opts = opts

        @structure = Hash.new
        @structure['forms'] = []
        @structure['links'] = []
        @structure['cookies'] = []
        @structure['headers'] = []

        @cookies = []
    end
Instance Attribute Details
- (Array<Hash <String, String> >) cookies (readonly)
Array of extracted cookies
    # File 'lib/analyzer.rb', line 74
    def cookies
        @cookies
    end
- (Array<Hash <String, String> >) forms (readonly)
Array of extracted HTML forms
    # File 'lib/analyzer.rb', line 62
    def forms
        @forms
    end
- (Array<String>) headers (readonly)
Array of valid HTTP headers
    # File 'lib/analyzer.rb', line 80
    def headers
        @headers
    end
- (Array<Hash <String, String> >) links (readonly)
Array of extracted HTML links
    # File 'lib/analyzer.rb', line 68
    def links
        @links
    end
- (Options) opts (readonly)
Options instance
    # File 'lib/analyzer.rb', line 87
    def opts
        @opts
    end
- (Hash<String, Hash<Array, Hash>>) structure (readonly)
Structure of the html elements in Hash format
    # File 'lib/analyzer.rb', line 56
    def structure
        @structure
    end
- (String) url
The url of the page
    # File 'lib/analyzer.rb', line 50
    def url
        @url
    end
Instance Method Details
- (Array<Hash <String, String> >) get_cookies(headers)
Extracts cookies from HTTP headers
    # File 'lib/analyzer.rb', line 301
    # NOTE: the extracted listing lost several identifiers. The local names
    # (cookies, cookies_arr, cookie, old) and the parse_set_cookies call
    # below are reconstructed and may differ from the original source.
    def get_cookies( headers )
        cookies = WEBrick::Cookie.parse_set_cookies( headers )

        cookies_arr = []

        cookies.each_with_index { |cookie, i|
            cookies_arr[i] = Hash.new

            cookie.instance_variables.each { |var|
                value = cookie.instance_variable_get( var ).to_s
                value.strip!

                key = normalize_name( var )
                val = value.gsub( /[\"\\\[\]]/, '' )

                cookies_arr[i][key] = val
            }

            # detect when a cookie has been updated and discard the old one
            @cookies.reject!{ |old| old['name'] == cookies_arr[i]['name'] }
        }

        return cookies_arr
    end
- (Array<Hash <String, String> >) get_forms(html)
Extracts forms from an HTML document.
TODO: Add support for radio buttons.
    # File 'lib/analyzer.rb', line 193
    def get_forms( html )
        elements = []

        begin
            #
            # This imitates Firefox's behavior when it comes to
            # broken/unclosed form tags
            #

            # get properly closed forms
            forms = html.scan( /<form(.*?)<\/form>/ixm ).flatten

            # now remove them from html...
            forms.each { |form| html = html.gsub( form, '' ) }

            # and get unclosed forms.
            forms |= html.scan( /<form (.*)(?!<\/form>)/ixm ).flatten

        rescue Exception => e
            print_error( "Error: Couldn't get forms from '" + @url +
                "' [" + e.to_s + "]" )
            return {}
        end

        i = 0
        forms.each { |form|
            elements[i] = Hash.new
            elements[i]['attrs'] = get_form_attrs( form )

            if( !elements[i]['attrs'] || !elements[i]['attrs']['action'] )
                action = @url.to_s
            else
                action = elements[i]['attrs']['action']
            end
            elements[i]['attrs']['action'] = to_absolute( action )

            if( !elements[i]['attrs']['method'] )
                elements[i]['attrs']['method'] = 'post'
            else
                elements[i]['attrs']['method'] =
                    elements[i]['attrs']['method'].downcase
            end

            elements[i]['attrs']['action'] = to_absolute( action )

            if !in_domain?( URI.parse( elements[i]['attrs']['action'] ) )
                next
            end

            elements[i]['textarea'] = get_form_textareas( form )
            elements[i]['select'] = get_form_selects( form )
            elements[i]['input'] = get_form_inputs( form )

            # merge the form elements to make auditing easier
            elements[i]['auditable'] =
                elements[i]['input'] | elements[i]['textarea']

            elements[i]['auditable'] =
                merge_select_with_input( elements[i]['auditable'],
                    elements[i]['select'] )

            i += 1
        }

        elements
    end
- (Hash) get_headers
Returns a list of valid auditable HTTP header fields.
It’s more of a placeholder method; it doesn’t actually analyze
anything.
It’s a long shot that any of these will be vulnerable,
but better safe than sorry.
    # File 'lib/analyzer.rb', line 165
    def get_headers( )
        return {
            'accept'          => 'text/html,application/xhtml+xml,application' +
                '/xml;q=0.9,*/*;q=0.8',
            'accept-charset'  => 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
            'accept-language' => 'en-gb,en;q=0.5',
            'accept-encoding' => 'gzip;q=1.0,deflate;q=0.6,identity;q=0.3',
            'from'            => @opts.authed_by,
            'user-agent'      => @opts.user_agent,
            'referer'         => @url,
            'pragma'          => 'no-cache'
        }
    end
- (Hash) get_link_vars(link)
Extracts variables and their values from a link
    # File 'lib/analyzer.rb', line 338
    def get_link_vars( link )
        if !link then return {} end

        var_string = link.split( /\?/ )[1]
        if !var_string then return {} end

        var_hash = Hash.new
        var_string.split( /&/ ).each { |pair|
            name, value = pair.split( /=/ )
            var_hash[name] = value
        }

        var_hash
    end
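To make the behaviour concrete, the snippet below repeats the logic of the listing above as a standalone method, so it can run without Arachni, and calls it on a few made-up URLs.

```ruby
# Standalone copy of the get_link_vars logic, for illustration only.
def get_link_vars( link )
    return {} if !link

    # everything after the first "?" is treated as the query string
    var_string = link.split( /\?/ )[1]
    return {} if !var_string

    var_hash = Hash.new
    var_string.split( /&/ ).each { |pair|
        name, value = pair.split( /=/ )
        var_hash[name] = value
    }
    var_hash
end

vars     = get_link_vars( 'http://example.com/search?q=ruby&page=2' )
# => { "q" => "ruby", "page" => "2" }

no_query = get_link_vars( 'http://example.com/' )
# => {}

flag     = get_link_vars( 'http://example.com/?debug' )
# => { "debug" => nil }
```

Note that values are returned as found in the URL (no URL-decoding is applied), and a parameter without `=` maps to `nil`.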
- (Array<Hash <String, String> >) get_links(html)
Extracts links from HTML document
    # File 'lib/analyzer.rb', line 277
    def get_links( html )
        links = []
        get_elements_by_name( 'a', html ).each_with_index { |link, i|
            link['href'] = to_absolute( link['href'] )

            if !link['href'] then next end
            if( exclude?( link['href'] ) ) then next end
            if( !include?( link['href'] ) ) then next end
            if !in_domain?( URI.parse( link['href'] ) ) then next end

            links[i] = link
            links[i]['vars'] = get_link_vars( link['href'] )
        }
    end
- (Hash<String, Hash<Array, Hash>>) run(url, html, headers)
Runs the Analyzer and extracts forms, links and cookies
    # File 'lib/analyzer.rb', line 116
    def run( url, html, headers )
        @url = url

        msg = "["

        elem_count = 0
        if @opts.audit_forms
            @structure['forms'] = get_forms( html )
            elem_count += form_count = @structure['forms'].length
            msg += "Forms: #{form_count}\t"
        end

        if @opts.audit_links
            @structure['links'] = get_links( html )
            elem_count += link_count = @structure['links'].length
            msg += "Links: #{link_count}\t"
        end

        # NOTE: the cookie-related identifiers in this branch were lost in
        # extraction and are restored by analogy with the other branches.
        if @opts.audit_cookies
            @cookies << get_cookies( headers['set-cookie'].to_s )
            @cookies.flatten!.uniq!
            @structure['cookies'] = @cookies
            elem_count += cookie_count = @structure['cookies'].length
            msg += "Cookies: #{cookie_count}\t"
        end

        if @opts.audit_headers
            @structure['headers'] = get_headers( )
            elem_count += header_count = @structure['headers'].length
            msg += "Headers: #{header_count}"
        end

        msg += "]\n\n"
        print_verbose( msg ) if !only_positives?

        return @structure
    end
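For orientation, here is a sketch of what a caller might do with the Hash returned by `run`. The hash below is hand-written and all of its values are made up. Only the top-level keys and the `attrs`, `auditable`, `href` and `vars` keys come from the listings on this page; the keys inside each auditable input (`name`, `type`) are assumptions.

```ruby
# Hand-written stand-in for the Hash returned by run(); values are illustrative.
structure = {
    'forms'   => [
        {
            'attrs'     => { 'action' => 'http://example.com/login',
                             'method' => 'post' },
            'auditable' => [ { 'name' => 'user', 'type' => 'text' } ]
        }
    ],
    'links'   => [
        { 'href' => 'http://example.com/?id=1', 'vars' => { 'id' => '1' } }
    ],
    'cookies' => [],
    'headers' => { 'referer' => 'http://example.com/' }
}

# Build a per-type element count, similar in spirit to the verbose message.
summary = structure.map { |type, elems|
    "#{type.capitalize}: #{elems.length}"
}.join( "\t" )

# Collect the names of every auditable form input.
input_names = structure['forms'].map { |form|
    form['auditable'].map { |input| input['name'] }
}.flatten
```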