Host-extract


Description

Aung Khant from the YGN Ethical Hacker Group has developed a tool that parses the content of web sites to reveal clues in the source code.

The tool can detect architecture details, cache systems, Content Delivery Networks (CDN), load balancers, and local IP addresses.

Thanks to my friend Aung Khant for his reviews.

Installation

Prerequisites

You will need Ruby and RubyGems:

$ sudo apt-get install ruby

Then install mechanize via gem:

$ sudo gem install mechanize

host-extract.rb

$ mkdir -p /pentest/enumeration/www/
$ cd /pentest/enumeration/www/
$ svn co http://host-extract.googlecode.com/svn/trunk/ host-extract

Usage

host-extract.rb script

Syntax

To run host-extract against a single URL, use the following syntax:

$ ruby host-extract.rb [options] <URL>

Options

-a
find all ip/host patterns
-j
scan all js files
-c
scan all css files
-v
append view-source html snippet for manual verification
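
As an illustration of what the -v option produces, the sketch below pairs each host found in a page with a short excerpt of the surrounding source. It is not the code of host-extract.rb: the URL_HOST pattern, the hosts_with_snippets helper and the width parameter are hypothetical names chosen for this example, and only hosts that appear inside absolute http/https URLs are considered.

```ruby
# Hypothetical pattern: captures the host part of absolute http/https URLs only.
URL_HOST = %r{https?://([a-z0-9][a-z0-9.-]*)}i

# Returns a hash mapping each distinct host (lowercased) to a source excerpt
# taken around its first occurrence, with whitespace runs collapsed.
def hosts_with_snippets(html, width = 15)
  found = {}
  html.scan(URL_HOST) do
    m = Regexp.last_match
    host = m[1].downcase
    next if found.key?(host)              # keep the first occurrence only
    start = [m.begin(0) - width, 0].max   # do not run past the start of the page
    snippet = html[start, (m.end(0) - start) + width]
    found[host] = snippet.gsub(/\s+/, ' ')
  end
  found
end

page = '<link rel="copyright" href="http://www.gnu.org/copyleft/fdl.html" />'
hosts_with_snippets(page).each do |host, snippet|
  puts "  - #{host}"
  puts "      #view-source: .. #{snippet}"
end
```

The excerpt lets you check by hand whether a match is a real host reference or a false positive.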

run.sh script

Launcher

To run it against a list of URLs, use the included launcher script (run.sh). Fill the url-list file with the URLs to check (one per line) and start the script as follows:

$ ./run.sh url-list 
----------------------------------------------
Extracting IP/Domain Patterns from url-list
----------------------------------------------

==================================================================
IP/Host Pattern Extractor (c) Aung Khant, aungkhant[at]yehg.net
  YGN Ethical Hacker Group, Myanmar, http://yehg.net/
svn co http://host-extract.googlecode.com/svn/trunk/ host-extract
==================================================================
Target: http://www.aldeid.com/
host: www.aldeid.com
path: /
-> http://www.aldeid.com/ | 301
(Redirected to : http://www.aldeid.com/wiki/Main_Page)

host: www.aldeid.com
path: /wiki/Main_Page

[*] searching for internal IP patterns ...
[x] no internal IP(s) found
[*] searching for IP/domain patterns ...
  - aldeid.com
      #view-source: ..  content="aldeid.com is a wiki about network and web applica
  - www.gnu.org
      #view-source: .. f="http://www.gnu.org/copyleft/fdl.html" />
[...Truncated...]

Notice that run.sh calls the host-extract.rb script with the -a option (find all IP/host patterns).

url-list file

The url-list (or whatever name you want to give) is a file that contains a list of URLs to check. Each entry must be placed on a new line.

$ cat url-list
http://www.aldeid.com/
http://www.google.fr/
http://www.mcafee.com/
http://www.sophos.fr/
http://www.amazon.com/
http://www.twitter.com/
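
The launcher logic described above can be approximated in Ruby. This is a minimal sketch and not the content of run.sh: the build_commands helper is a hypothetical name, and the only behaviour taken from this page is that each URL is passed to host-extract.rb with the -a option. The sketch prints the commands instead of executing them.

```ruby
# Builds one host-extract command (as an argument list) per non-blank line.
def build_commands(lines, script = 'host-extract.rb')
  lines.map(&:strip)
       .reject(&:empty?)
       .map { |url| ['ruby', script, '-a', url] }
end

# Sample input standing in for the lines of a url-list file.
sample = ["http://www.aldeid.com/\n", "\n", "http://www.google.fr/\n"]
build_commands(sample).each { |cmd| puts cmd.join(' ') }

# To run a command for real, pass the argument list to system(*cmd);
# the list form keeps the shell from interpreting characters in the URL.
```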

Examples

Cache system

The following example reveals the presence of cache servers (the betacache and fastcache hosts):

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:02:05
# --------------------------------------------------------------------------
# URL: http://gawker.com
# [*] searching for internal IP patterns ...

# [x] no internal IP(s) found

# [*] searching for IP/domain patterns ...
www.facebook.com
      #view-source: .. b="http://www.facebook.com/2008/fbml">
gawker.com
      #view-source: ..   grpid: 'gawker.com',
betacache.gawkerassets.com
      #view-source: .. c="http://betacache.gawkerassets.com/assets/base.v10/js/gome
fastcache.gawkerassets.com
      #view-source: .. f="http://fastcache.gawkerassets.com/assets/base.v10/css/../
v10.gawker.com
      #view-source: .. om/assets/v10.gawker.com/css/style.css?rev=20110311" />
www.google.com
      #view-source: .. c="http://www.google.com/jsapi"></script>
[...TRUNCATED...]

Content Delivery Network (CDN)

The following example reveals the use of CDN servers (the cdn1 to cdn4 hosts):

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:01:35
# --------------------------------------------------------------------------
# URL: http://digg.com
# -> http://digg.com | 302
# (Redirected to : /news)

#########################################################################
# URL: http://digg.com/news
# [*] searching for internal IP patterns ...
10.2.129.82
      #view-source: .. an title="10.2.129.82 build: 210 - fri mar 11 1

# [*] 1 internal IP(s) found!

# [*] searching for IP/domain patterns ...
cdn1.diggstatic.com
      #view-source: .. f="http://cdn1.diggstatic.com/img/favicon.a015f25c.ico">
cdn2.diggstatic.com
      #view-source: .. f="http://cdn2.diggstatic.com/css/two_column/library/global.
ad.doubleclick.net
      #view-source: .. = 'http://ad.doubleclick.net/adj/dgg.tn/home_tn_uprrail1;pt=
cdn4.diggstatic.com
      #view-source: .. c="http://cdn4.diggstatic.com/story/ipad_2_first_video_look_
cdn3.diggstatic.com
      #view-source: .. c="http://cdn3.diggstatic.com/story/more_aftershocks_explosi
dads.new.digg.com
      #view-source: .. , 'http://dads.new.digg.com', 'kw=pos:3&kw=topics:%2a&kw=pag
about.digg.com
      #view-source: .. f="http://about.digg.com/blog/breaking-breaking-news" class=
[...TRUNCATED...]

Another example, for lemonde.fr, which makes use of Amazon Web Services (s3.amazonaws.com):

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:02:17
# --------------------------------------------------------------------------
# URL: http://www.lemonde.fr/
# [*] searching for internal IP patterns ...

# [x] no internal IP(s) found

# [*] searching for IP/domain patterns ...
monde.fr
      #view-source: .. <title>le monde.fr : actualité à la une</ti
medias.lemonde.fr
      #view-source: .. f="http://medias.lemonde.fr/medias/info/favicon.ico" />
[...TRUNCATED...]
google-analytics.com
      #view-source: .. www') + '.google-analytics.com/ga.js';
s3.amazonaws.com
      #view-source: ..  "https://s3.amazonaws.com/" : "http://") +

Local IP addresses

The following example emphasizes the presence of a local IP address:

$ ruby host-extract.rb -a http://allrecipes.com
==================================================================
IP/Host Pattern Extractor (c) Aung Khant, aungkhant[at]yehg.net
  YGN Ethical Hacker Group, Myanmar, http://yehg.net/
svn co http://host-extract.googlecode.com/svn/trunk/ host-extract
==================================================================
Target: http://allrecipes.com
host: allrecipes.com
path: /

[*] searching for internal IP patterns ...
  - 192.168.5.143
[*] 1 internal IP(s) found!

[*] searching for IP/domain patterns ...
  - 192.168.5.143
  - www.icra.org
  - allrecipes.com
  - images.media-allrecipes.com
  - secure.allrecipes.com
  - www.facebook.com
  - twitter.com
  - ad.doubleclick.net
  - bs.serving-sys.com
  - hostedjobs.openhire.com
  - mantestedrecipes.com
  - www.tasteofhome.com
  - www.rachaelraymag.com
  - www.rd.com
  - www.mozilla.org
  - allrecipes.cn
  - allrecipes.fr
  - allrecipes.de
  - allrecipes.jp
  - allrecipes.co.uk
  - metric.allrecipes.com
  - an.tacoda.net
[*] total IP/Host pattern(s): 22

# Send bugs & suggestions to host-extract @ yehg.net

Another local IP address disclosure in the source code:

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:02:51
# --------------------------------------------------------------------------
# URL: http://tinypic.com
# [*] searching for internal IP patterns ...
10.2.253.1
      #view-source: .. <!-- 10.2.253.1 -->

# [*] 1 internal IP(s) found!

# [*] searching for IP/domain patterns ...
static.tinypic.com
      #view-source: .. f="http://static.tinypic.com/s/global_v4.3.28.css" type="tex

Another example, for moneycontrol.com, which reveals the loopback address:

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:00:45
# --------------------------------------------------------------------------
# URL: http://moneycontrol.com
# -> http://moneycontrol.com | 301
# (Redirected to : http://www.moneycontrol.com/?)

#########################################################################
# URL: http://www.moneycontrol.com/?
# [*] searching for internal IP patterns ...
127.0.0.1
      #view-source: .. ef=http://127.0.0.1/mccode/markets/homebody.php' />    <!-- 

# [*] 1 internal IP(s) found!

# [*] searching for IP/domain patterns ...
www.moneycontrol.com
      #view-source: .. rl=http://www.moneycontrol.com">
moneycontrol.com
      #view-source: .. ttp://www.moneycontrol.com">
[...TRUNCATED...]
127.0.0.1
      #view-source: .. ef=http://127.0.0.1/mccode/markets/homebody.php' />    <!--
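
The internal IP search illustrated by the three outputs above can be sketched as follows. The exact ranges checked by host-extract.rb are not documented on this page, so the sketch assumes the RFC 1918 private ranges plus the loopback range; INTERNAL_RANGES, IPV4_CANDIDATE and internal_ips are hypothetical names.

```ruby
require 'ipaddr'

# Assumed definition of "internal": RFC 1918 private ranges plus loopback.
INTERNAL_RANGES = %w[10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 127.0.0.0/8]
                  .map { |cidr| IPAddr.new(cidr) }

# Loose candidate pattern; IPAddr does the real validation afterwards.
IPV4_CANDIDATE = /\b(?:\d{1,3}\.){3}\d{1,3}\b/

# Returns the distinct internal IPv4 addresses found in text, in order of appearance.
def internal_ips(text)
  text.scan(IPV4_CANDIDATE).uniq.select do |candidate|
    begin
      ip = IPAddr.new(candidate)
      INTERNAL_RANGES.any? { |range| range.include?(ip) }
    rescue IPAddr::Error
      false # not a valid address (for example 999.1.1.1)
    end
  end
end

source = "<!-- 10.2.253.1 --> <a href=http://127.0.0.1/mccode/> (194.68.129.144)"
puts internal_ips(source)
```

Validating with IPAddr rather than relying on the regular expression alone avoids reporting strings that merely look like addresses.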

External IP addresses

The following example shows external IP addresses:

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:01:19
# --------------------------------------------------------------------------
# URL: http://ovh.net/
# [*] searching for internal IP patterns ...

# [x] no internal IP(s) found

# [*] searching for IP/domain patterns ...
ovh.net
      #view-source: .. <title>ovh.net</title>
www.ripe.net
      #view-source: .. f="http://www.ripe.net/perl/whois?form_type=simple&full_quer
www.renater.fr
      #view-source: .. f="http://www.renater.fr/sfinx/" target="_blank">sfinx</a> (
194.68.129.144
      #view-source: .. finx</a> (194.68.129.144)</b></td><td><small>(1x1gbps)</smal
www.freeix.net
      #view-source: .. f="http://www.freeix.net/" target="_blank">freeix</a> (213.2
213.228.3.225
      #view-source: .. eeix</a> (213.228.3.225)</b></td><td><small>(1x1gbps)</small
www.ovh.com
      #view-source: .. f="http://www.ovh.com" target="_blank" title="dépôt de 
ovh.com
      #view-source: .. ttp://www.ovh.com" target="_blank" title="dépôt de domaines

Sub domains

The following example reveals sub domains:

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:00:45
# --------------------------------------------------------------------------
# URL: http://moneycontrol.com
# -> http://moneycontrol.com | 301
# (Redirected to : http://www.moneycontrol.com/?)

#########################################################################
# URL: http://www.moneycontrol.com/?
# [*] searching for internal IP patterns ...
127.0.0.1
      #view-source: .. ef=http://127.0.0.1/mccode/markets/homebody.php' />    <!-- 

# [*] 1 internal IP(s) found!

# [*] searching for IP/domain patterns ...
www.moneycontrol.com
      #view-source: .. rl=http://www.moneycontrol.com">
moneycontrol.com
      #view-source: .. ttp://www.moneycontrol.com">
img1.moneycontrol.com
      #view-source: .. c="http://img1.moneycontrol.com/images/ad/xerox/swfobject.js
stat1.moneycontrol.com
      #view-source: .. c="http://stat1.moneycontrol.com/mcjs/common/relona_script_0
mmb.moneycontrol.com
      #view-source: .. n('http://mmb.moneycontrol.com/india/messageboard/po
www.macromedia.com
      #view-source: .. e='http://www.macromedia.com/go/getflashplayer' type='applic
im.in.com
      #view-source: .. rl(http://im.in.com/connect/images/enhanced_google.gif) no-r
stat2.moneycontrol.com
      #view-source: .. f="http://stat2.moneycontrol.com/mccss/revamp/mc20
[...TRUNCATED...]
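
To isolate the sub domains of the target from third-party hosts in such a list, a simple suffix filter is enough. This is a hedged sketch that post-processes a list of extracted hosts; subdomains_of is a hypothetical helper and is not part of host-extract.

```ruby
# Returns the sorted, distinct hosts that are sub domains of the given domain.
# The bare domain itself is excluded because it lacks the ".domain" suffix.
def subdomains_of(domain, hosts)
  suffix = ".#{domain.downcase}"
  hosts.map(&:downcase).uniq.select { |host| host.end_with?(suffix) }.sort
end

hosts = %w[
  www.moneycontrol.com moneycontrol.com img1.moneycontrol.com
  stat1.moneycontrol.com mmb.moneycontrol.com www.macromedia.com
  im.in.com stat2.moneycontrol.com
]
puts subdomains_of('moneycontrol.com', hosts)
```

Matching on the suffix with a leading dot prevents an unrelated name such as notmoneycontrol.com from being counted.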

Development language

The following example, run against symantec.com, suggests that the site is developed in Java: the home page redirects to a JSP script (index.jsp).

$ ruby host-extract.rb http://www.symantec.com
==================================================================
IP/Host Pattern Extractor (c) Aung Khant, aungkhant[at]yehg.net
  YGN Ethical Hacker Group, Myanmar, http://yehg.net/
svn co http://host-extract.googlecode.com/svn/trunk/ host-extract
==================================================================
Target: http://www.symantec.com
host: www.symantec.com
path: /
-> http://www.symantec.com | 301
(Redirected to : http://www.symantec.com/index.jsp?)

host: www.symantec.com
path: /index.jsp

[*] searching for internal IP patterns ...
[x] no internal IP(s) found

# Send bugs & suggestions to host-extract @ yehg.net
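
The same deduction can be automated by looking at the file extension of the redirect target. The sketch below is an illustration only: the EXTENSION_HINTS table and the technology_hint helper are hypothetical, the mapping is deliberately small, and an extension is a hint rather than proof, since servers can map any extension to any technology.

```ruby
require 'uri'

# Hypothetical, deliberately short mapping from file extension to technology.
EXTENSION_HINTS = {
  '.jsp'  => 'Java (JSP)',
  '.php'  => 'PHP',
  '.aspx' => 'ASP.NET',
  '.asp'  => 'classic ASP'
}.freeze

# Returns a technology hint for the URL path, or nil when the extension is unknown.
def technology_hint(url)
  path = URI.parse(url).path.to_s
  EXTENSION_HINTS[File.extname(path).downcase]
end

puts technology_hint('http://www.symantec.com/index.jsp?')
```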

Use of dropbox

The following example reveals a dropbox host (dropbox.yousendit.com) that serves a PHP script:

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 09:00:32
# --------------------------------------------------------------------------
# URL: http://www.yousendit.com
# [*] searching for internal IP patterns ...
192.168.40.30
      #view-source: ..  192.168.40.30 </div>^M

# [*] 1 internal IP(s) found!

# [*] searching for IP/domain patterns ...
yousendit.com
      #view-source: .. domain = "yousendit.com";^M
blog.yousendit.com
      #view-source: .. f="http://blog.yousendit.com" target="_blank">blog</a
ftf-641.yousendit.com
      #view-source: .. = "http://ftf-641.yousendit.com";
ftf.yousendit.com
      #view-source: .. =='http://ftf.yousendit.com') {
dropbox.yousendit.com
      #view-source: .. = "http://dropbox.yousendit.com/transfer.php?action=dropb
192.168.40.30
      #view-source: ..  192.168.40.30 </div>^M

# [*] total IP/Host pattern(s): 6

Architecture

This example discloses the development server:

==================================================================
IP/Host Pattern Extractor (c) Aung Khant, aungkhant[at]yehg.net
  YGN Ethical Hacker Group, Myanmar, http://yehg.net/
svn co http://host-extract.googlecode.com/svn/trunk/ host-extract
==================================================================
Target: http://gawker.com
host: gawker.com
path: /

[*] searching for internal IP patterns ...
[x] no internal IP(s) found
[*] searching for IP/domain patterns ...

[...TRUNCATED...]

  - dev.gawker.com:8888
      #view-source: .. 		'host' : 'dev.gawker.com:8888',

[*] total IP/Host pattern(s): 22

# Send bugs & suggestions to host-extract @ yehg.net

Another example run against gazeta.pl:

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 11:36:54
# --------------------------------------------------------------------------
# URL: http://www.gazeta.pl/
# -> http://www.gazeta.pl/ | 301
# (Redirected to : http://www.gazeta.pl/0,0.html?)

#########################################################################
# URL: http://www.gazeta.pl/0,0.html?
# [*] searching for internal IP patterns ...

# [x] no internal IP(s) found

# [*] searching for IP/domain patterns ...
google-analytics.com
      #view-source: ..  '.google-analytics.com/ga.js';^M
jedynka:8130
      #view-source: ..  <!-- iw10jedynka:8130 hp 30 -->    ^M
www.booking.com
      #view-source: .. f="http://www.booking.com/index.html?aid=332229&lang=pl&labe
clk.tradedoubler.com
      #view-source: .. p%3a%2f%2fclk.tradedoubler.com%2fclick%3fp%3d203575%26a%3d16
www.ciacha.net
      #view-source: .. f="http://www.ciacha.net/ciacha/0,0.html">ciacha</a></li><li
www.pracownicy.it
      #view-source: .. f="http://www.pracownicy.it/">pracownicy it</a></li><li clas

# [*] total IP/Host pattern(s): 6

Another example, run against a JavaScript file hosted on the McAfee website (an Omniture profile script):

# Generated by host-extract (c) Aung Khant, http://yehg.net/lab/
# Send bugs/suggestions to host-extract @ yehg.net
# Date: 2011-03-13 11:36:59
# --------------------------------------------------------------------------
# URL: http://www.mcafee.com/js/omniture/omniture_profile.js
# [*] searching for internal IP patterns ...
172.31.30.227
      #view-source: .. nname == "172.31.30.227" || domainname == "172.3
172.31.30.226
      #view-source: .. nname == "172.31.30.226" || domainname == "daldevwebcms3:860

# [*] 2 internal IP(s) found!

# [*] searching for IP/domain patterns ...
www.mcafee.com
      #view-source: .. nname == "www.mcafee.com" || domainname == "mcaf
mcafee.com
      #view-source: .. e == "www.mcafee.com" || domainname == "mcafee.com" 
secure.nai.com
      #view-source: .. nname == "secure.nai.com" || domainname == "vil.nai.com" || 
vil.nai.com
      #view-source: .. nname == "vil.nai.com" || domainname == "www.foundstone.com"
www.foundstone.com
      #view-source: .. nname == "www.foundstone.com" || domainname == "secure.mcafe
secure.mcafee.com
      #view-source: .. nname == "secure.mcafee.com")
internal.nai.com
      #view-source: .. nname == "internal.nai.com")
161.69.217.15
      #view-source: .. nname == "161.69.217.15" || 
161.69.202.116
      #view-source: .. nname == "161.69.202.116" || domainname == "161.69.202.117:8
dalqawebcms1:8600
      #view-source: .. nname == "dalqawebcms1:8600" || domainname == "dalqawebcms2:
dalqawebcms2:8600
      #view-source: .. nname == "dalqawebcms2:8600" || 
sncstgwebcms1:8600
      #view-source: .. nname == "sncstgwebcms1:8600" || domainname =
sncstgwebcms2:8600
      #view-source: .. nname == "sncstgwebcms2:8600" || domainname == "sncstgwebcms
172.31.30.227
      #view-source: .. nname == "172.31.30.227" || domainname == "172.3
172.31.30.226
      #view-source: .. nname == "172.31.30.226" || domainname == "daldevwebcms3:860
daldevwebcms3:8600
      #view-source: .. nname == "daldevwebcms3:8600" ||
daldevwebcms3:8089
      #view-source: .. nname == "daldevwebcms3:8089" || domainname ==
sncwebdevpview1:82
      #view-source: .. nname == "sncwebdevpview1:82" || 
sncwebdevpview1:444
      #view-source: .. nname == "sncwebdevpview1:444" || domainname =
sncqawebpview1:440
      #view-source: .. nname == "sncqawebpview1:440" || 
sncwebdevpview1:81
      #view-source: .. nname == "sncwebdevpview1:81" || domainname ==
vil.qa.nai.com
      #view-source: .. nname == "vil.qa.nai.com" || domainname == "sncwebdevpview1:
sncwebdevpview1:8084
      #view-source: .. nname == "sncwebdevpview1:8084" || 
daldevwebcms4:8600
      #view-source: .. nname == "daldevwebcms4:8600" ||domainname == 
devinternal.na.nai.com
      #view-source: .. nname == "devinternal.na.nai.com" || domainname
dalqawebcms1:8080
      #view-source: .. nname == "dalqawebcms1:8080" || domainname == "sncwwwprod1.p
sncwwwprod1.prod.mcafee.com
      #view-source: .. nname == "sncwwwprod1.prod.mcafee.com"
sncwwwprod2.prod.mcafee.com
      #view-source: .. nname == "sncwwwprod2.prod.mcafee.com" || domainna
sncwwwprod3.prod.mcafee.com
      #view-source: .. nname == "sncwwwprod3.prod.mcafee.com" || domainname == "snc
sncwwwprod4.prod.mcafee.com
      #view-source: .. nname == "sncwwwprod4.prod.mcafee.com"
sncwwwprod5.prod.mcafee.com
      #view-source: .. nname == "sncwwwprod5.prod.mcafee.com" || domainna
sncwwwprod6.prod.mcafee.com
      #view-source: .. nname == "sncwwwprod6.prod.mcafee.com" || domainname == "snc
sncwebcms1:8600
      #view-source: .. nname == "sncwebcms1:8600"
161.69.207.22
      #view-source: .. nname == "161.69.207.22" || domainname == "161.69.
161.69.207.23
      #view-source: .. nname == "161.69.207.23" ||domainname == "161.69.207.22:8600
sncwebcms2:8600
      #view-source: .. nname == "sncwebcms2:8600" 
searchmcafee.mcafee.com
      #view-source: .. nname == "searchmcafee.mcafee.com" || domainname =
phoenix-beta.mcafee.com
      #view-source: .. nname == "phoenix-beta.mcafee.com" || domainname == "dalqawe
eloqua.com
      #view-source: .. nname == "eloqua.com" || domainname == "daldevwebcms3:9696"
daldevwebcms3:9696
      #view-source: .. nname == "daldevwebcms3:9696"
phoenix.qa.nai.com
      #view-source: .. nname == "phoenix.qa.nai.com"||domainname == "sph
sphoenix.qa.nai.com
      #view-source: .. nname == "sphoenix.qa.nai.com"
phoenix.qa.nai.com:8600
      #view-source: .. nname == "phoenix.qa.nai.com:8600"||domainname ==
sphoenix.qa.nai.com:8443
      #view-source: .. nname == "sphoenix.qa.nai.com:8443"
sphoenix-uat.qa.nai.com
      #view-source: .. nname == "sphoenix-uat.qa.nai.com"||domainname ==
phoenix-uat.qa.nai.com
      #view-source: .. name == "sphoenix-uat.qa.nai.com"||domainname == "
sphoenix-uat.qa.nai.com:8600
      #view-source: .. nname == "sphoenix-uat.qa.nai.com:8600"||domainna
phoenix-uat.qa.nai.com:8443
      #view-source: .. nname == "phoenix-uat.qa.nai.com:8443" 
phoenix.corp.nai.org
      #view-source: .. nname == "phoenix.corp.nai.org"||domainname == "s
sphoenix.corp.nai.org
      #view-source: .. nname == "sphoenix.corp.nai.org"
phoenix.corp.nai.org:8600
      #view-source: .. nname == "phoenix.corp.nai.org:8600"||domainname 
sphoenix.corp.nai.org:8443
      #view-source: .. nname == "sphoenix.corp.nai.org:8443"
phoenix.dev.nai.com
      #view-source: .. nname == "phoenix.dev.nai.com"||domainname == "sp

# [*] total IP/Host pattern(s): 53
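
Long listings like the one above are easier to read once the entries are grouped by the environment their names suggest. The sketch below is a rough heuristic, not a feature of host-extract: ENV_MARKERS and environment_hints are hypothetical names, and the meaning given to each marker (dev, qa, stg, uat, prod) is an assumption about naming conventions that must be verified manually.

```ruby
# Assumed naming conventions; markers are tried in this order and the first hit wins.
ENV_MARKERS = {
  'dev'  => :development,
  'qa'   => :qa,
  'stg'  => :staging,
  'uat'  => :uat,
  'prod' => :production
}.freeze

# Groups host[:port] entries by the first marker found in the host part.
# Entries without any marker are left out of the result.
def environment_hints(entries)
  entries.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |entry, groups|
    host, _port = entry.split(':', 2)
    marker = ENV_MARKERS.keys.find { |key| host.downcase.include?(key) }
    groups[ENV_MARKERS[marker]] << entry if marker
  end
end

entries = %w[dev.gawker.com:8888 daldevwebcms3:8600 dalqawebcms1:8080
             sncstgwebcms1:8600 sncwwwprod1.prod.mcafee.com www.mcafee.com]
environment_hints(entries).each do |env, hosts|
  puts "#{env}: #{hosts.join(', ')}"
end
```

Because the match is a plain substring test, unrelated names that happen to contain a marker will be flagged too.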
