Is your site too deep? Badly designed link structure? Test it!

One of the things I look at when I get a fresh new site to analyse is the link structure. All sites, and I repeat: all sites, should have a basic structure that looks like a simple tree diagram:

[Image: tree structure]

All pages link to one or more pages below them. This is the basis of your site's structure. I'm not saying it's the only structure, because you could also interlink between different categories or products, but it's the main structure. It has the following advantages:

  • It's easy to check whether all pages can be reached through this link structure
  • You can and must use breadcrumbs to link to the pages above:
    "Home" -> "Category A" -> "Product 1"
    This lets search engines enter the site anywhere and crawl everything just by following the breadcrumbs upwards and the normal links downwards. It also gives users a 'sense of place' in your site.
  • This structure can also be represented in a page's URL: "/home/category-a/product-1", although that is not really necessary. If the URLs become too long I would leave out some levels.
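
The reachability check from the first bullet can be sketched as a breadth-first search over the site's links. This is only an illustration: the `links` dictionary and its URLs are made up, and the homepage is counted as level 1, following the numbering used in this article.

```python
from collections import deque

def click_depths(links, home):
    """Breadth-first search from the homepage; the homepage is level 1."""
    depth = {home: 1}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:  # first visit is the shortest path
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth

# Hypothetical site: page -> pages it links to
links = {
    "/": ["/category-a", "/category-b"],
    "/category-a": ["/category-a/product-1"],
    "/category-b": [],
    "/category-a/product-1": ["/"],  # breadcrumb back up
    "/orphan": ["/"],                # nothing links to this page
}

depths = click_depths(links, "/")
unreachable = sorted(set(links) - set(depths))
print(depths)
print("Unreachable:", unreachable)
```

Any page that is known to exist (for example from the CMS) but is missing from `depths` cannot be reached by following links from the homepage.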

So let's say your site has about 80,000 pages. By creating a good structure with (for example) exactly 10 links on every page to underlying pages, you can fit every page into a site that is 6 levels deep:

[Image: graph of a good structure]

Level 1 is the homepage, which links to 10 category pages; those link to 100 sub-category pages, which link to 1,000 product pages, etc. All pages can be reached in 5 clicks from your homepage.
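
The arithmetic behind this can be checked with a small sketch. It assumes an idealised tree in which every page links to exactly the same number of new pages; the function names are made up for this example.

```python
def pages_within(levels, links_per_page):
    """Total pages in a perfect tree with the given number of levels.

    Level 1 (the homepage) has 1 page, level 2 has links_per_page pages,
    level 3 has links_per_page ** 2 pages, and so on.
    """
    return sum(links_per_page ** i for i in range(levels))

def levels_needed(total_pages, links_per_page):
    """Smallest number of levels that can hold total_pages pages."""
    levels = 1
    while pages_within(levels, links_per_page) < total_pages:
        levels += 1
    return levels

print(pages_within(6, 10))        # 1 + 10 + 100 + 1000 + 10000 + 100000
print(levels_needed(80_000, 10))  # levels needed for the 80,000-page example
```

With 10 links per page, 5 levels hold only 11,111 pages, so an 80,000-page site needs the sixth level, which is 5 clicks from the homepage.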

How to create this for your own site

First you need a tool that can crawl your site. I use the Screaming Frog SEO Spider to do this for me. This tool generates a report listing all the pages it could find by simply following the HTML links from your homepage, just like the search engines do. Export the report as a CSV file with this button:

[Image: Screaming Frog export button]

Before you can use this CSV file you have to do some cleanup work. Open the file in an advanced text editor (I use Notepad++). Find the search and replace option and replace all semicolons with nothing (Excel doesn't handle this character the way we would like). The next step is to replace all newlines in meta descriptions with spaces, otherwise our file will look like this:

"http://andrescholten.net/page1", "Description
with a newline"
"http://andrescholten.net/page2", "Good description"
etc...

And it should look like this:

"http://andrescholten.net/page1", "Description with a newline"
"http://andrescholten.net/page2", "Good description"
etc...

I do this with an advanced regular expression that removes all these unwanted newlines:

[Image: Notepad++ search and replace dialog]

The replace field is empty on purpose. Save the CSV and open it in Excel. Excel has this function to split the Comma Separated Values (CSV) into columns:

[Image: Excel "Text to Columns" function]
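
As an alternative to the manual cleanup, the same two steps (dropping semicolons and folding newlines inside fields into spaces) can be sketched in Python. This is not the regular expression from the screenshot: it uses Python's standard `csv` module instead, which already understands quoted fields that span several lines. The function names and the sample data are assumptions for illustration.

```python
import csv
import io

def clean_rows(rows):
    """Remove semicolons and collapse newlines/whitespace runs into one space."""
    for row in rows:
        yield [" ".join(cell.replace(";", "").split()) for cell in row]

def clean_csv(src_path, dst_path):
    """Write a cleaned copy of a CSV export (both paths are up to you)."""
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, skipinitialspace=True)
        csv.writer(dst, quoting=csv.QUOTE_ALL).writerows(clean_rows(reader))

# In-memory demo using the example lines from this article
sample = (
    '"http://andrescholten.net/page1", "Description\nwith a newline"\n'
    '"http://andrescholten.net/page2", "Good description"\n'
)
cleaned = list(clean_rows(csv.reader(io.StringIO(sample), skipinitialspace=True)))
for row in cleaned:
    print(row)
```

Note that this sketch also collapses repeated spaces inside a field, which the manual search and replace does not do.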

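If everything went right you should have a column called "Level". Count how many pages there are on each level (for example with COUNTIF on a second sheet), create a new graph from those counts, and you should see something like this: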

[Image: graph of a bad structure]

As you can probably see, the example above is the graph of a site with a very bad link structure. It has only 41,000 pages, but it takes 31 levels to reach them all, and most pages are around 8 to 9 clicks (levels) away from the homepage. This site clearly needs an extra, or better, a new basic link structure. Right now it depends on pages that link to other related pages. The problems:

  • we don't know if all pages can be reached like this; we should check the CMS to see how many pages there should be and make sure they are all linked
  • pages 31 clicks away from the homepage are really not important to any search engine
  • the site doesn't really have a structure, so it's difficult for a visitor to navigate through it, or to find a specific page again
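
The per-level counts behind such a graph can also be produced without Excel. The sketch below counts the pages on each level from a CSV that has a "Level" column. The "Address" column name and the sample rows are assumptions made for this example; adjust them to match your own export.

```python
import csv
import io
from collections import Counter

def pages_per_level(rows, column="Level"):
    """Count rows per level, skipping rows without a numeric level."""
    counts = Counter()
    for row in rows:
        value = (row.get(column) or "").strip()
        if value.isdigit():
            counts[int(value)] += 1
    return dict(sorted(counts.items()))

# Hypothetical export; a real file would be opened with open(path, newline="")
sample = io.StringIO(
    "Address,Level\n"
    "http://example.com/,0\n"
    "http://example.com/a,1\n"
    "http://example.com/b,1\n"
    "http://example.com/a/1,2\n"
)
histogram = pages_per_level(csv.DictReader(sample))

# Simple text graph: one '#' per page on that level
for level, count in histogram.items():
    print(f"{level:>2} {'#' * count} {count}")
```

A healthy structure shows counts that grow quickly over a handful of levels; a long tail of levels with few pages each points to the problems listed above.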

So, what does your site look like when graphed this way?

  • http://www.facebook.com/svalk Sander Valk

    Hi André. Thanks for this article. However, the link to Screaming Frog doesn't work for me. Also, you use an example, but what is the optimal average (number of levels) according to Google? We have about 400,000 pages; within how many clicks should every page be reachable then, is that also 5? Do you have any tips on what we should aim for?

    • http://andrescholten.net/ André Scholten

      The URL of the link is correct, so it should work?
      There is no optimal number of levels, but if you place 10 links on every page, as in my example, you can already reach 100,000 pages after 5 clicks. You can then choose to go one level deeper so you can reach 1,000,000 pages, or simply raise the number of links per page to, for example, 20. So the number of clicks is not very important, as long as the site has a good structure.

  • http://twitter.com/mhoving Martijn Hoving

    It looks like this (it seems well structured, but lacks level 1 and 2 pages).

    • http://andrescholten.net/ André Scholten

      Level 1 is the homepage, but it looks like a splash page with 1 link to a homepage on level 2? Most incoming links will point to your homepage, so you should really move pages up a few levels ;)

      • http://twitter.com/mhoving Martijn Hoving

        I would like that too :) This graph is another way to persuade our client to redesign (and restructure). Thanks for sharing, André.

  • http://twitter.com/MarcoSchuurman Marco Schuurman

    When a website has a (rich) menu with submenu items you often get something like this. The menu including sub-items has 78 links; what do you think about that?

    • http://andrescholten.net/ André Scholten

      Looks like a great example of a good site structure at first sight: 100 pages on level 1 and 1,000 on the next level.

    • http://www.pure-escort.com/ Kay

      This does look good to me!

  • jskooij

    Thanks André, especially for the tip in Excel. I have used Screaming Frog before, but as always you can do magic with it :)

    Oops... I thought I'd give it a try... but what does "replace all semicolons for nothing" mean?
    What are the semicolons?

    • Sivand

      Also had problems with the semicolons..

  • Jambonbuzz

    Hello, good article. Why don't you use the import feature in Excel (Data > From Text), then select CSV + UTF-8 with the separator you want? Works perfectly here.

  • Marc Rudisuhli

    Thanks Andre, your regex didn't work for me but this one worked:

    [^,;]+rn.
    Hope that helps...

  • Marc Rudisuhli

    I see that in my previous comment I forgot to say that you need to use Notepad++ version 6.3.2 (or higher) and find & replace the following:

    [^,;]+rn
    Hope this is clearer ;-)

  • http://twitter.com/MademoisellePan Monika Faseth

    It is much easier when you use Open Office Calc for importing the Screaming Frog data. It doesn't mess with the columns and you don't need to do any enhancements of the imported data.

  • http://www.bestofamsterdam.com/nl/ Sandra

    Great, thanks for the overview!

  • http://www.internetaanbieders.info/ Tim

    Can't wait to find out!

  • Ronald B

    What a great article! I'm already busy with it, but I can't quite figure it out. I get stuck when creating the graph. Right now I have a tidy internal_all.csv, and when I open it in Excel I go to the 'levels' heading, which has 21,000 points.

    How do I make sure that when I select the Level column, the levels end up neatly on the x-axis and the counts on the y-axis? Because I get 21,000 points on the x-axis or y-axis.

    Preferably step by step :) Because it really is an excellent article!!!

    • http://andrescholten.net/ André Scholten

      I see now that I'm missing a step in the article. On a second sheet you need to create rows containing the numbers 0-30. Then in the cell next to each one: =AANTAL.ALS(Blad1!L:L; A1) (AANTAL.ALS is the Dutch Excel name for COUNTIF), where Blad1 is your sheet with all the data and L is the column that holds the levels. A1 refers to the cells with 0-30 you just created. Does that work?

  • Mark

    Interesting article, I never really thought about this. But I will definitely get to work on it. My http://weightcaps.com/nl/afvallen-met-weightcaps probably has far too many pages.

  • MichelK

    Hi André,

    There's an easier way to load the data into Excel. With Excel open:

    1. type something in cell A1
    2. Select column A
    3. Go to Text to Columns and make it separate on commas
    4. Find the Screaming Frog output and open it with Notepad
    5. copy the data
    6. paste the data into cell A1
    7. Voila!

    Thanks for the interesting insights.