What does netloc mean?

Solution 1

According to RFC 1808, Section 2.1, a URL has the following general format:

<scheme>://<netloc>/<path>;<params>?<query>#<fragment>

Let's break this format down syntactically:

  • scheme: The protocol name, usually http/https
  • netloc: Contains the network location, which includes the domain itself (and subdomain if present), the port number, and optional credentials in the form username:password. Together it may take the form username:password@host:80.
  • path: Identifies the specific resource on the host that is being accessed.
  • params: Parameters attached to the path, separated from it by a semicolon. (optional)
  • query: Additional information passed along with the request for the path, introduced by a question mark. (optional)
  • fragment: Identifies a specific part within the resource being accessed, introduced by a hash sign. (optional)

Let's take a very simple example to understand the above clearly:

https://cat.com/list;meow?breed=siberian#pawsize

In the above example:

  • https is the scheme (first element of a URL)
  • cat.com is the netloc (sits between the scheme and path)
  • /list is the path (between the netloc and params)
  • meow is the params (sits between path and query)
  • breed=siberian is the query (between the params and fragment)
  • pawsize is the fragment (last element of a URL)

This can be replicated programmatically using Python's urllib.parse.urlparse:

>>> import urllib.parse
>>> url = 'https://cat.com/list;meow?breed=siberian#pawsize'
>>> urllib.parse.urlparse(url)
ParseResult(scheme='https', netloc='cat.com', path='/list', params='meow', query='breed=siberian', fragment='pawsize')
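
To see how the netloc itself breaks down into credentials, host, and port, here is a small sketch using urllib.parse.urlsplit. The URL, including its username and password, is a made-up illustration and not something from the question.

```python
from urllib.parse import urlsplit

# Hypothetical URL with credentials and a port, for illustration only.
parts = urlsplit("https://username:password@cat.com:80/list?breed=siberian")

print(parts.netloc)    # the whole network location: username:password@cat.com:80
print(parts.username)  # username
print(parts.password)  # password
print(parts.hostname)  # cat.com
print(parts.port)      # 80 (returned as an int)
```

The netloc attribute is the raw string between the scheme and the path; the other attributes are convenience views that pick it apart.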

Now, coming to your code: the if statement checks whether next_page is missing or whether it has a netloc. In that login() function, url_parse(next_page).netloc != '' is true when next_page is an absolute URL, i.e. one that names a host. A relative URL has a path but no hostname (and thus an empty netloc). So if next_page is empty or absolute, it is replaced with url_for('index'); only a relative URL, which stays on the same site, is used as the redirect target.
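
The same check can be tried outside Flask. The following is a minimal sketch that uses the standard library's urllib.parse.urlsplit in place of the url_parse from the question; the function name safe_next_page and the "/index" default are hypothetical stand-ins for the url_for('index') fallback, and this is not meant as a complete defence against malicious redirects.

```python
from urllib.parse import urlsplit


def safe_next_page(next_page, default="/index"):
    """Return next_page only if it is a relative URL, else the default.

    `safe_next_page` and the "/index" default are hypothetical names
    standing in for the `url_for('index')` fallback in the question.
    """
    # No `next` argument at all: fall back to the default page.
    if not next_page:
        return default
    # A non-empty netloc means the URL names a host, so it could
    # point at another site.
    if urlsplit(next_page).netloc != "":
        return default
    return next_page


print(safe_next_page("/profile"))                # /profile
print(safe_next_page("https://evil.example/x"))  # /index
print(safe_next_page("//evil.example/x"))        # /index
print(safe_next_page(None))                      # /index
```

Note that "//evil.example/x" has no scheme but still carries a netloc, which is why the check looks at netloc rather than at the scheme.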

Solution 2

import urllib.parse

url = "https://google.com/something?a=1&b=1"
# urlsplit works like urlparse, but does not split params out of the path
o = urllib.parse.urlsplit(url)
print(o.netloc)

google.com

Author: Tri

Updated on June 04, 2022

Comments

  • Tri
    Tri almost 2 years

    I'm learning to write a login function with Flask-Login, and I came across this code in the tutorial I'm following:

    @app.route('/login', methods = ['GET', 'POST'])
    def login():
        if current_user.is_authenticated:
            return redirect(url_for('index'))
        form = LoginForm()
        if form.validate_on_submit():
            user = User.query.filter_by(username=form.username.data).first()
            if user is None or not user.check_password(form.password.data):
                flash('Invalid username or password')
                return redirect(url_for('login'))
            login_user(user, remember=form.remember_me.data)
            next_page = request.args.get('next')
            if not next_page or url_parse(next_page).netloc != '':  # what does this line mean?
                next_page = url_for('index')
            return redirect(next_page)
        return render_template('login.html', title='Sign In', form=form)
    

    But I'm not sure what the line I commented above means, especially the word netloc. What is that? I know that it stands for network locality, but what is its purpose on that line?

    • Paul Rooney
      Paul Rooney over 5 years
      Although the function you are calling is from werkzeug, you can probably look to the standard library for the definition of netloc. See urllib.parse.urlparse. netloc is the name of the server (IP address or host name).
  • cowlinator
    cowlinator almost 5 years
    In RFC 1808 Section 2.1, net_loc stands for network location, and represents the (optional) login information, the hostname, and the (optional) port number. According to RFC 1738 Section 3.1, this must take the form <user>:<password>@<host>:<port>. This is consistent with what Python 3's documentation on urllib.parse.urlparse's ParseResult.netloc states. In that login() function, checking if .netloc != '' means that it is checking whether the result of url_parse(next_page) is a relative URL. A relative URL has a path but no hostname (and thus no netloc).
  • Agent Zebra
    Agent Zebra almost 5 years
    Any idea why it's called a netloc?
  • augurar
    augurar almost 4 years
    @AgentZebra See previous comment, it's a contraction of network location