Everything You Wanted to Know About URLs (And Best Practices for their Optimization)

URLs are important in the digital marketing industry, and especially in the field of SEO. URLs are the address of our online home – our website, which we want to rank well for relevant queries. They are the face we show to visitors that we want to convert.

Due to their crucial role, we at SEOptimer always place a large focus on formatting URLs in such a way that they meet the highest optimization standards. As you may have noticed, we’ve also built checks into our Audit Tool report that review these factors on your site.

Over the years, we’ve received numerous questions related to URL optimization and URL terminology. For this reason, we’ve decided to create a comprehensive guide covering the most frequent questions, intended to be an all-inclusive resource.

Defining URLs

A Uniform Resource Locator (URL) is the address of a unique resource on the Web. In web browsers, the URL of a specific web page is displayed in the address bar at the top.

In theory, each URL points to one unique resource, which can include an HTML page, an image, a CSS document, etc. They are most commonly used to reference web pages (http/https), but also emails (mailto), file transfer (ftp), database access (JDBC), and many more.

URL vs. link

Before we get much further, we should distinguish between a URL and a link. Although the two terms are often used interchangeably, a link is actually just a clickable snippet of text on a page, associated with the URL it takes you to.

URL structure

It’s useful to consider a URL as nothing more than a regular postal mail address – check out the image below:

[Image: URL building]

Now, let’s get more technical and break down the different URL segments. Each URL consists of mandatory and optional parts:

Mandatory URL parts:

1. Protocol – HTTP/HTTPS

This is the first part of a URL which indicates the protocol a web browser has to use, and it represents a defined method for transferring or exchanging data via a computer network. Historically, http was the most commonly used protocol, but due to search engine algorithm changes and new industry demands, the more secure version called https dominates the web now. Aside from the two, mailto: for opening a mail client and ftp for handling file transfers can also appear in your address bar.

2. Domain name – www.seoptimer.com

Domain names are an entirely separate science that we’ll delve into in another blog post, but in summary, they are a name reference to a web server that serves webpages. A domain name is mapped to a server using the server’s IP address and the Domain Name System (DNS). There is a huge variety of domain names that you can choose for a site, but in general, it’s better to keep them short and memorable.

* Subdomain – www

A subdomain is a subset of a larger domain. It is displayed before the first dot (.) in a domain name, which in this case (and in the majority of cases) is www. However, it is possible to use any word to create a unique web address without having to change your domain name. A subdomain is an extension of a registered domain name, and it allows you to send website visitors to a different web address or to point to specific directories in your hosting account. The majority of webmasters use subdomains to organize their websites into sections according to specific niches, but you can also use a subdomain to separate a blog or eCommerce site from the main one, to develop a separate website for mobile devices, etc.

3. Path to file – /seo-services/

In the past, a path indicated a physical file location on the web server. Now, paths are an abstraction handled by web servers that points to particular pieces of content.
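
To make the three mandatory parts more concrete, here is a minimal sketch that splits our example URL using Python's standard urllib.parse module. The subdomain split is a deliberately naive assumption (everything before the first dot) and is only meant to mirror the description above.

```python
from urllib.parse import urlsplit

# The example URL used in this section.
url = "https://www.seoptimer.com/seo-services/"
parts = urlsplit(url)

print(parts.scheme)    # protocol
print(parts.hostname)  # domain name, subdomain included
print(parts.path)      # path

# Naive assumption: everything before the first dot is the subdomain.
subdomain, _, registered_domain = parts.hostname.partition(".")
print(subdomain, registered_domain)
```

Note that the naive split would give a misleading result for a host with no subdomain (such as seoptimer.com), so treat it as an illustration rather than a general-purpose parser.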

Optional URL parts

To better analyze the parts of a URL that are usually omitted, we’ll provide the following example:

[Image: URL anchors]

1. TLD – .com

A TLD, or domain extension, follows the last dot (.) in a domain name. .com leads with more than 80 million registered domains worldwide, while the rest of the top 10 is largely made up of country-specific TLDs, such as .DE (Germany), .CN (China), .UK, .US and others.

So how do you choose the domain extension that will benefit your online branding efforts the most? Let’s take a brief look at some of the most popular domain extensions and their characteristics:

  • .com – Although it was originally derived from the word ‘commercial’, and focused on ‘for-profit businesses’, today it is widely used by both personal users and businesses of all sizes, niches and types.
  • .net – Similar to .com, this TLD is available for everyone. Initially, it was reserved for networks and internet service providers, but now it is a great .com alternative, especially for application- and tech-based companies, since many users associate it with technology.
  • .org – At first this TLD was used by non-profit organizations, but today it is a popular TLD for many non-governmental organizations, politicians and political parties.
  • .gov – This particular domain extension is reserved solely for government agencies.
  • .edu – Only educational institutions are able to acquire the .edu domain extension.
  • .info – Short for ‘information’, .info is an open TLD available for all users.
  • .xyz – Like .info, it is available for general use.
  • .ly – Although it is a Libyan country code, we’ve seen numerous startups taking advantage of .ly to build creative, catchy and punny domain names.
  • Many more – Recently, a great number of new domain types have hit the scene, such as .accountants and .technology. We would generally recommend sticking to the main ones, but sometimes these can make a good alternative when the .com version is not available.

2. Port – 443

A port is a part of a URL that is usually omitted. It can be called a gate that is used to access the resource on the web server, and it differs based on the protocol that is used (for instance, port 443 is for https, and 80 for http). You can choose to explicitly include the port, but if it is not included, it will default based on the protocol chosen (http/https etc).
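
The defaulting behavior described above can be sketched in a few lines of Python. The function name effective_port and the DEFAULT_PORTS table are our own illustrative choices, not part of any library.

```python
from typing import Optional
from urllib.parse import urlsplit

# Well-known default ports for a few protocols (illustrative, not exhaustive).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def effective_port(url: str) -> Optional[int]:
    """Return the explicit port if the URL has one, else the protocol default."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("https://www.seoptimer.com/seo-services/"))
print(effective_port("http://www.seoptimer.com:8080/seo-services/"))
```

The first call falls back to the https default, while the second returns the port written explicitly in the URL.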

3. File Name – index.html

Historically, most websites would be shown by referring to specific file names. These days, the actual files being read by a web server are abstracted from users, so it is less and less frequent (and less recommended) to show filenames.

4. Parameters – key1=value1&key2=value2

Parameters are a list of key/value pairs that follow a “?” in the URL and are separated from each other with the “&” symbol. Web servers use them to complete certain actions before returning a response. Pages that include things such as contact forms often use these parameters to send details to another page.
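
As a small illustration, Python's standard urllib.parse module can read parameters out of a URL and build them back into a query string. The URL below is a made-up example that combines the file name and parameters from this section.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical URL built from the examples in this section.
url = "https://www.seoptimer.com/index.html?key1=value1&key2=value2"

# Everything after "?" (and before any "#") is the query string.
query = urlsplit(url).query
params = parse_qs(query)  # each key maps to a list, since keys may repeat
print(params)

# Going the other way: build a query string from key/value pairs.
rebuilt = urlencode({"key1": "value1", "key2": "value2"})
print(rebuilt)
```

parse_qs returns lists of values because the same key can legitimately appear more than once in a URL.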

5. Anchor – #aspecificpointinthedocument

Using the “#” character, you can point to a specific part of the webpage. In a way, it enables you to bookmark a certain part of the document, telling the browser exactly where on the page users should land.

6. Trailing slash – /

A trailing slash is a forward slash placed at the very end of a URL to mark a directory. The absence of a trailing slash used to mean that the URL takes you to a file, not a directory, but this is no longer the case.

If a webpage can be opened both with and without a trailing slash (or with several), Google may index each version separately and treat them as duplicate content, or even as spam. Furthermore, multiple indexing can split incoming traffic between the versions, which additionally affects your rankings.

Whether or not you will be using a trailing slash is not important, as long as you’re consistent with it. You can use rel=“canonical” or 301 redirects to direct users and search engines to the preferred version.
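
One way to stay consistent is to normalize every URL to a single form before linking to it or redirecting to it. The sketch below assumes the site has chosen the version with a trailing slash as its preferred one; the function name with_trailing_slash is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def with_trailing_slash(url: str) -> str:
    """Return the URL with exactly one trailing slash on its path."""
    parts = urlsplit(url)
    # Drop any run of trailing slashes, then add exactly one back.
    path = parts.path.rstrip("/") + "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))

print(with_trailing_slash("https://www.seoptimer.com/seo-services"))
print(with_trailing_slash("https://www.seoptimer.com/seo-services//"))
```

Both calls produce the same preferred URL, which is the address a 301 redirect or rel=“canonical” would then point to. A site that still exposes file names such as index.html would need to exclude those paths.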

Characters used for structuring URLs

There tends to be a lot of confusion about which characters can be used in a URL. Below, we differentiate:

  • Reserved set of characters

This is the most commonly used set. Each character carries a specific meaning to web servers:

;    |    /    |   ?    |   :    |    @    |    &    |    =    |    +     |    $    |    ,

  • Unreserved set of characters

This set of characters doesn’t need to be encoded/escaped when included as a separate part of a URL:

-    |    _    |    .    |    !    |    ~    |    *    |    '    |    (    |    )

  • Unwise set of characters

As the name suggests, it is not wise to use them in a URL, since they can be modified by gateways or used as delimiters in some cases:

{    |    }    |    |    |    \    |    ^    |    [    |    ]    |    `

  • Excluded set of characters

The excluded set is made up of all ASCII control characters, the space character and the following group of characters:

<    |    >    |    #    |    %    |    "

These should be escaped, except for some which can have a special meaning in a specific context (for instance, you may have noticed that we’ve already mentioned “#” when we talked about URL parts, where this character’s purpose is to refer to a specific location in a document).
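
Escaping (also called percent-encoding) replaces a character with “%” followed by its hexadecimal code. Here is a minimal sketch using quote and unquote from Python's standard urllib.parse module; the sample text is made up for illustration.

```python
from urllib.parse import quote, unquote

# Made-up path segment containing a space and two excluded characters.
segment = "seo services <draft>"

# safe="" asks quote() to escape everything except letters, digits and "_.-~".
encoded = quote(segment, safe="")
print(encoded)

# unquote() reverses the escaping.
print(unquote(encoded))
```

The space becomes %20, while “<” and “>” become %3C and %3E, so the encoded segment can be placed safely in a URL path.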

URL categories – Absolute and relative URLs

Absolute URLs contain all the information necessary to locate a certain resource (though protocol and port are optional elements and don’t need to be included).

When it comes to relative URLs (which would in our case be /seo-services), as their name indicates, they are always interpreted as relative to another URL, known as the base URL. They are most commonly found in HTML documents. To convert a relative URL into its absolute form, it is resolved against the base URL: this can be explicitly specified in the document using the HTML <base> tag, and if it is not, the URL of the document itself is treated as the base for that relative URL.
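
Resolving a relative URL against a base can be sketched with urljoin from Python's standard urllib.parse module. The base URL below is a hypothetical page on our site, used only for illustration.

```python
from urllib.parse import urljoin

# Hypothetical URL of the document that contains the relative links.
base = "https://www.seoptimer.com/blog/url-guide"

# A relative URL starting with "/" is resolved from the root of the site.
absolute = urljoin(base, "/seo-services")
print(absolute)

# A relative URL without a leading "/" replaces the last segment of the base path.
sibling = urljoin(base, "pricing")
print(sibling)
```

The first result points at the root-level /seo-services page, while the second stays inside /blog/, which is why the choice of base URL matters.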

URL types

URLs are classified as dynamic and static, the latter being more SEO-friendly.

Dynamic URLs

This type of URL is often the result of a search, product, or category page retrieved from a database.

They are considered less search engine- and user-friendly, as they tend to contain unintelligible characters which make it difficult to understand the purpose of a page. As such, dynamic URLs don’t get indexed as quickly as static ones and can be considered spammy.

Static URLs

When a web page has a static URL, it means that its content is generally fixed. Static URLs contain mostly alphanumeric characters and usually include relevant keywords, thus indicating what the content of the page is about.

Static pages tend to get indexed faster, rank higher and enjoy a greater CTR. 

Just think about it – which of the two would you prefer:

https://www.seoptimer.com/seo-services

or

https://www.seoptimer.com/index.php?page=seo/services

HTTP vs HTTPS

In August 2014, Google officially announced that transferring a website from HTTP to HTTPS protocol would produce ranking benefits.

So, what’s the difference between the two? They are both hypertext transfer protocols, right?

True, but as search engines are working to improve user experience, it was decided to add an additional layer of security (HTTPS).

But let’s explain both protocols in greater detail so you can understand why Google decided to mark HTTP websites as not secure and include SSL as one of its ranking factors.

  • HTTP – Hypertext Transfer Protocol is a system for transmitting and receiving information across the Internet. However, as it is an application layer protocol, its focus is merely on presenting information to the user, not on how the data is transferred. This means that it doesn’t pay extra attention to the security of your data.

  • HTTPS – Hypertext Transfer Protocol Secure was developed to allow authorization and secured transactions. It works in conjunction with the Secure Sockets Layer (SSL) protocol to transport your data safely. Taking that into consideration, it’s no wonder Google uses HTTPS as a ranking signal and prefers websites which encrypt user data for an additional layer of security.

While HTTPS and SSL are often used interchangeably, they are not the same thing. SSL (and its successor, TLS) is the standard security technology which establishes an encrypted link between a browser and a web server. HTTPS is a secure protocol because it utilizes this technology.

However, it is no longer just another ranking signal. Effective July 2018, Google’s Chrome browser will mark all sites which don’t implement an SSL certificate as not secure, which makes the move from HTTP to HTTPS imperative.

Status Codes

A status code can be thought of as part of an online conversation between the server and a browser. The information they send to one another revolves around the status of the request – to put it simply, the discussion is largely based on whether everything is OK or not. To be more technical – an HTTP status code is a server’s response to a browser’s request.

There are several classes of status codes, but for now, we are going to focus on status codes that are crucial for SEO experts.

Status code 200

This is an ideal status code for a functioning page.

301 redirect

This status code refers to a permanent redirect to a different URL. A 301 redirect means that everything, from users and bots to link equity will be passed on. Nevertheless, note that not all redirects are treated equally, and the 301 still remains the preferred method.

302 redirect

Status code 302 is a temporary redirect and is, for instance, highly beneficial for eCommerce websites that wish to indicate an ongoing sale or discount offer. This status code is not recommended for permanent changes, as it doesn’t tell search engines to pass link equity.

Status code 404

A 404 error means that the page cannot be found by the server. However, the code alone doesn’t tell you whether the page never existed or has been removed temporarily or permanently. Practices for handling 404 errors differ: some site owners use 301 redirects to send users to the most relevant page, while others leave the 404 in place.
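
For reference, the four codes discussed above can be looked up with HTTPStatus from Python's standard http module. The describe helper and its rough grouping into success, redirect and error are our own simplification.

```python
from http import HTTPStatus

def describe(code: int) -> str:
    """Return the code, its standard reason phrase and a rough class label."""
    status = HTTPStatus(code)
    if 200 <= code < 300:
        kind = "success"
    elif 300 <= code < 400:
        kind = "redirect"
    elif code >= 400:
        kind = "error"
    else:
        kind = "informational"
    return f"{code} {status.phrase} ({kind})"

for code in (200, 301, 302, 404):
    print(describe(code))
```

The reason phrases show the distinction made above: 301 is “Moved Permanently”, while 302 is reported simply as “Found”.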

Canonical tags

From an SEO perspective, canonical tags are similar to 301 redirects. Both indicate that multiple pages should be considered one page. Nevertheless, certain crucial differences need to be pointed out:

  • 301 redirects forward all traffic (bots and users) to a different URL, while canonical tags only affect bots; users stay on the page they requested.
  • A 301 redirect can be applied across different domains, while canonical tags are mostly used within one website (Google does also accept cross-domain canonicals, but treats them as a hint).
  • The 301 is a much stronger signal that multiple pages have a single source.

The reason you may opt for canonical instead of a 301 redirect is if you have an old branding element you want your users to see, but when it comes to search engines you want to indicate that out of the two pages, you would prefer the new one to be ranked.

Just like a 301 redirect, a canonical tag doesn’t pass on all link juice, but only some of it.
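
A canonical tag is a link element with rel=“canonical” placed in the page’s head. As a rough sketch, the snippet below uses Python's standard html.parser module to read the canonical URL out of a page; the class name CanonicalFinder and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from typing import Optional

class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        is_canonical = (attr_map.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.canonical is None:
            self.canonical = attr_map.get("href")

# Made-up page that declares its preferred URL.
page = (
    '<html><head>'
    '<link rel="canonical" href="https://www.seoptimer.com/seo-services">'
    '</head><body>Old branded page</body></html>'
)

finder = CanonicalFinder()
finder.feed(page)
print(finder.canonical)
```

Users who open this page still see its content; only search engines are told which URL is preferred.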

URL optimization – Best practices

Good URL structure allows users and search engines to connect the dots of your website logically and crawl the pages more easily. It can significantly improve user experience and enhance your SEO efforts. While we’ve already scratched the surface of this topic, there are many other practices that need to be implemented to help you reduce bounce rate, improve dwell time and accelerate your rankings.

English-like, user-friendly structure

We recommend that your site’s structure be strategically planned before you even commence your website setup. The most important principle is that a URL structure follows the navigation you’ve established, but here are a few additional points to keep in mind:

  • Make it logical. Simplicity will be appreciated both by users and search engines. First, create the most important categories, and then move on to subcategorizing in the way it seems most logical based on your website content.
  • Keep the number of categories as low as possible. This, of course, does not apply to large eCommerce websites that sell a variety of items, but even they have to be organized logically into separate categories. An average business or personal website should keep the number of categories somewhere between 2 and 7.
  • Create a shallow depth navigation structure. What this means is that none of your pages should be buried deep into the structure but should be easily accessed in no more than 2 or 3 logical clicks.

Sitelinks

Sitelinks are direct links to your sub-pages, shown in a Google search result. They are meant to help users navigate to website content faster. Google automatically generates these based on what they think will be most relevant to searchers. You can’t control these directly, but to maximize the quality of sitelinks that are shown, make sure any links within your site to internal pages have good Anchor or Alt text that’s informative and succinct.

Internal Linking

Internal links will help establish the hierarchy of the website, spread link juice and enable users to navigate the website with ease. Ensure anchors contain the same keywords as URLs of the pages you are linking to.

Keyword Mapping

This is an entire science of its own, but generally, you would want to optimize your pages, and subsequently their URLs to reflect keywords you want to rank for. You would determine these keywords through a detailed keyword research process to identify opportunities, then structure your page content and URLs to reflect these opportunities.

Use of keywords

Using relevant keywords (that you’ve chosen via a Keyword Mapping process) in URLs helps indicate to both search engines and users what the page is about. There are three rules to live by:

  • Match the URL with the title of the page. Just to be clear, we don’t mean to aim for a 100% match, but just enough to indicate what the page is about. This matters, because most users simply glance at a URL to make assumptions regarding the content of the page. If it doesn’t meet their expectations, they decide to leave abruptly, leaving you with a higher bounce rate.
  • Target 1 or 2 keywords per URL. If a website is properly structured, that means that a single page revolves around a single product/service, or at least a group of closely related products or services. In that case, you should need no more than two keywords to describe what the page is about and to include in the URL. Anything more than that could cause confusion with search engines, which will in turn affect your rankings for relevant queries.
  • Don’t stuff your URLs. In the past, when Search Engines were simpler, website owners could try to ‘stuff’ particular keywords repeatedly into URLs, or other page elements to maximize rankings. This doesn’t cut it anymore, and you need to ensure that this content appears natural and not manipulative.

Use of characters

In order for a URL to be readable and easy-to-understand, you have to make a strategic choice of characters. A variety of strange symbols and numbers can confuse users and make the URL less memorable.

Take a look at the three URLs – which one do you trust the most?

  • https://www.seoptimer.com/seo-services
  • https://www.seoptimer.com/post?ID=77&kw=seo+services
  • https://cdn07.seoptimer.com/post?ID=77&kw=seo+services

It’s not just strange characters that should be avoided though. Here are a few other guidelines:

  • Use alphanumeric characters where possible.
  • Skip punctuation and other unsafe characters. Certain characters are difficult to read and can make URL comprehension impossible. Check out the complete list on Perishable Press.
  • Use only lowercase characters. Servers running Microsoft Windows typically ignore case in URL paths, but Linux/UNIX servers treat them as case-sensitive, so a visitor who capitalizes a character in the URL can land on a 404 page.
  • Stop words are not necessary. Stop words are the most common words in a language, and search engines are programmed to ignore them. As such, they can be excluded from URLs. Here you can find a complete list of stop words.
  • Use hyphens and not underscores. This reflects more advice provided by Google, as hyphens have been shown to be easier on the eye, and, whenever there’s an underscore between words, Google chooses to combine the two words as if they are written without a separator of any kind. To put it simply – hyphens help improve your SEO score.
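
Several of the guidelines above (lowercase, alphanumeric characters, no stop words, hyphens as separators) can be combined into a small helper that turns a page title into a URL segment. This is only a sketch: the slugify function and the very short STOP_WORDS set are our own illustrative assumptions, and non-English letters are simply dropped.

```python
import re

# A deliberately tiny stop word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "to", "in"}

def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    # Keep runs of lowercase letters and digits; punctuation, spaces
    # and underscores all act as word boundaries.
    words = re.findall(r"[a-z0-9]+", title.lower())
    # Drop stop words, unless that would leave nothing at all.
    kept = [word for word in words if word not in STOP_WORDS] or words
    return "-".join(kept)

print(slugify("The Best SEO_Services for Startups!"))
```

The underscore in the title is treated as a separator and replaced with a hyphen, in line with the advice above.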

URL length

Google advises its users to keep their URLs as simple and as intelligible to humans as possible. Shorter URLs are easier to remember and share, meaning that cutting down on their length may make your content easier to promote. Once a URL exceeds 512 pixels (roughly 64 characters), Google will truncate it in search results.

Redirects and canonicalization

Having duplicate content on your site can create problems with search engines. Thus, if you notice the same or similar content published on two different URLs, opt for either a 301 redirect or rel=canonical, as we’ve discussed previously. It is also highly advisable to use 301 redirects when moving from HTTP to HTTPS, as less than 5% of the top 10,000 websites currently redirect users automatically.
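
The HTTP to HTTPS redirect boils down to a simple rule: if a request arrives over http, answer with status 301 and the same address over https. Here is a minimal sketch of that rule; the function name https_redirect is hypothetical, and it assumes the site answers on the default HTTPS port, so any explicit port in the original URL is dropped.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

def https_redirect(url: str) -> Optional[Tuple[int, str]]:
    """Return (301, https_url) for an http URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    if parts.scheme != "http":
        return None
    # Assumption: HTTPS is served on its default port, so only the host is kept.
    target = urlunsplit(
        ("https", parts.hostname or "", parts.path, parts.query, parts.fragment)
    )
    return 301, target

print(https_redirect("http://www.seoptimer.com/seo-services?ref=home"))
print(https_redirect("https://www.seoptimer.com/seo-services"))
```

In practice this rule usually lives in the web server or hosting configuration rather than in application code, but the logic is the same.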

Secure URL structure

Although the primary purpose of SSL was to secure user data, it has since become a ranking signal, and as Chrome is about to mark all non-HTTPS sites as not secure, it will also help build trust with users. With Chrome currently holding almost 60% of market share, not implementing SSL correctly is a problem you should avoid at all costs.

Conclusion

The information provided in this article has been collected from years of experience working with websites, and URLs specifically – helping our users make the most of their websites.

I hope you’ve enjoyed this guide, and if you have any additional questions, feel free to reach out to us via Livechat.