
Have you ever been frustrated by the amount of noise you have to wade through with git log -p -- <filepath>? What if there was a way to narrow down the diff output so that you could see how a single function or method evolved over time, and ignore all the other cruft that you don't care about?

As with most things git, there's an obscure and underutilized option in the docs that can do exactly that!

Inspiration

Let's take an example inspired by a real-life bug[1] I was diagnosing, one that was ostensibly caused by an upgrade to Authlib.

Context

Authlib is an incredibly well-written package; if you've ever had to implement part of an OAuth2 RFC spec, then you are familiar with that particular kind of pain. But even Authlib developers get things wrong from time to time, and code gets changed to conform to various specs, or is refactored for the sake of maintainability and interoperability.

One of those spec changes was causing our test suite to fail, and the report left me with a class and method name that needed to be investigated.

git log -L

To diagnose why the upgrade was causing problems, I needed to dive into the Authlib code to see the evolution of a particular method. To do this, we're going to use the rather obscure -L flag for git log[2]:

-L<start>,<end>:<file>, -L:<funcname>:<file>
    Trace the evolution of the line range given by <start>,<end>, or by
    the function name regex <funcname>, within the <file>. You may not
    give any pathspec limiters. This is currently limited to a walk starting
    from a single revision, i.e., you may only give zero or one positive revision
    arguments, and <start> and <end> (or <funcname>) must exist in the starting
    revision. You can specify this option more than once. Implies --patch.

As noted in the docs snippet, there are several ways that you can invoke this option. We're going to use the most straightforward variant, with -L:<funcname>:<file>, but to do that we first need to embark on a small side-quest.

Side Quest

Authlib is written in Python, and it turns out that we need to tell git to use its built-in Python regular expressions for diff hunk headers on *.py files. Without that, git falls back to a generic heuristic that only treats unindented lines as function headers, so an indented method definition is not recognized.

To do this, you'll need to create either a repository-level .gitattributes file, or a global or system-wide one. Let's do a repository-level one for this example:

# authlib/.gitattributes
*.py diff=python

You'd think that git would have this very sane, very expected association configured by default, but you'd be wrong[3].
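To get a feel for why the attribute matters, here is a rough Python sketch comparing two ways of deciding whether a line counts as a function header. Both patterns are simplified approximations written for this post, not copies of git's source: DEFAULT_HEADER stands in for git's generic heuristic (a line starting in column 0 with a letter, `_`, or `$`), and PYTHON_HEADER stands in for the built-in python driver. SAMPLE is a made-up stand-in for a class body, not Authlib's actual code.

```python
import re

# Approximation of the generic heuristic: a header line starts in
# column 0 with a letter, underscore, or dollar sign.
DEFAULT_HEADER = re.compile(r"^[A-Za-z_$]")

# Approximation of the "python" driver: optional indentation, then
# "class", "def", or "async def".
PYTHON_HEADER = re.compile(r"^[ \t]*(class|(async[ \t]+)?def)[ \t]")


def header_lines(source, pattern):
    """Return the stripped lines of `source` that `pattern` treats as headers."""
    return [line.strip() for line in source.splitlines() if pattern.search(line)]


# Hypothetical sample source, for illustration only.
SAMPLE = '''\
class RevocationEndpoint(TokenEndpoint):
    ENDPOINT_NAME = 'revocation'

    def authenticate_token(self, request, client):
        return None
'''

default_hits = header_lines(SAMPLE, DEFAULT_HEADER)
python_hits = header_lines(SAMPLE, PYTHON_HEADER)
print(default_hits)  # only the class line
print(python_hits)   # the class line and the indented def line
```

Under the generic pattern the indented `def` never qualifies as a header, which is consistent with git being unable to find the method by name until the python driver is enabled.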

Tracing Function Evolution in a File

Once the relevant .gitattributes entry has been added, you can then use the following invocation pattern to trace the evolution of funcname in a file:

$ git log -L :<funcname>:<path/to/file>

In our example, I want to know how the authenticate_token method of the RevocationEndpoint class in the authlib/oauth2/rfc7009/revocation.py file has changed over time:

$ git log -L :authenticate_token:authlib/oauth2/rfc7009/revocation.py

commit 3655d285d4062e9a3118a0c55884e8a36acb1b16
Author: Kartik Ohri <kartikohri13@gmail.com>
Date:   Tue Apr 23 13:46:44 2024 +0530

    rfc7009: return error if client validation fails
    
    [Section 2 of RFC 7009](https://datatracker.ietf.org/doc/html/rfc7009#section-2) says:
    
    "The authorization server first validates the client credentials (in
     case of a confidential client) and then verifies whether the token
     was issued to the client making the revocation request.  If this
     validation fails, the request is refused and the client is informed
     of the error by the authorization server as described below."
    
    Accordingly, update the code to return an invalid_grant error if the token being
    revoked does not belong to client credentials supplied.

diff --git a/authlib/oauth2/rfc7009/revocation.py b/authlib/oauth2/rfc7009/revocation.py
--- a/authlib/oauth2/rfc7009/revocation.py
+++ b/authlib/oauth2/rfc7009/revocation.py
@@ -18,17 +18,18 @@
     def authenticate_token(self, request, client):
         """The client constructs the request by including the following
         parameters using the "application/x-www-form-urlencoded" format in
         the HTTP request entity-body:
 
         token
             REQUIRED.  The token that the client wants to get revoked.
 
         token_type_hint
             OPTIONAL.  A hint about the type of the token submitted for
             revocation.
         """
         self.check_params(request, client)
         token = self.query_token(request.form['token'], request.form.get('token_type_hint'))
-        if token and token.check_client(client):
-            return token
+        if token and not token.check_client(client):
+            raise InvalidGrantError()
+        return token
 

commit d589d4ff513a90168118f7bdec00b2fcaac49f41
Author: Éloi Rivard <eloi@yaal.coop>
Date:   Sun Aug 27 14:49:05 2023 +0200

    feat: implement rfc9068 JWT Access Tokens

diff --git a/authlib/oauth2/rfc7009/revocation.py b/authlib/oauth2/rfc7009/revocation.py
--- a/authlib/oauth2/rfc7009/revocation.py
+++ b/authlib/oauth2/rfc7009/revocation.py
@@ -18,12 +18,17 @@
     def authenticate_token(self, request, client):
         """The client constructs the request by including the following
         parameters using the "application/x-www-form-urlencoded" format in
         the HTTP request entity-body:
 
         token
             REQUIRED.  The token that the client wants to get revoked.
 
         token_type_hint
             OPTIONAL.  A hint about the type of the token submitted for
             revocation.
         """
+        self.check_params(request, client)
+        token = self.query_token(request.form['token'], request.form.get('token_type_hint'))
+        if token and token.check_client(client):
+            return token
+

commit 2b37fbe6d8a0773a5e22a8f2104052ad27f65485
Author: Hsiaoming Yang <me@lepture.com>
Date:   Fri Nov 6 21:40:51 2020 +0900

    Refactor RevocationEndpoint and IntrospectionEndpoint

diff --git a/authlib/oauth2/rfc7009/revocation.py b/authlib/oauth2/rfc7009/revocation.py
--- a/authlib/oauth2/rfc7009/revocation.py
+++ b/authlib/oauth2/rfc7009/revocation.py
@@ -18,12 +18,12 @@
-    def authenticate_endpoint_credential(self, request, client):
+    def authenticate_token(self, request, client):
         """The client constructs the request by including the following
         parameters using the "application/x-www-form-urlencoded" format in
         the HTTP request entity-body:
 
         token
             REQUIRED.  The token that the client wants to get revoked.
 
         token_type_hint
             OPTIONAL.  A hint about the type of the token submitted for
             revocation.
         """

commit 0fa2d036660e6a8ca25acb9e0ca1bd21836fc766
Author: Hsiaoming Yang <me@lepture.com>
Date:   Wed Sep 11 20:34:48 2019 +0900

    Refactor authorization extra endpoints

diff --git a/authlib/oauth2/rfc7009/revocation.py b/authlib/oauth2/rfc7009/revocation.py
--- a/authlib/oauth2/rfc7009/revocation.py
+++ b/authlib/oauth2/rfc7009/revocation.py
@@ -18,12 +18,12 @@
-    def validate_endpoint_request(self):
+    def authenticate_endpoint_credential(self, request, client):
         """The client constructs the request by including the following
         parameters using the "application/x-www-form-urlencoded" format in
         the HTTP request entity-body:
 
         token
             REQUIRED.  The token that the client wants to get revoked.
 
         token_type_hint
             OPTIONAL.  A hint about the type of the token submitted for
             revocation.
         """

commit 8b535a8b09ebeaa9d9410e4c86e0371abc7d7bd6
Author: Hsiaoming Yang <me@lepture.com>
Date:   Wed Jan 2 22:29:19 2019 +0900

    Move oauth2 rfcs out of specs folder.

diff --git a/authlib/oauth2/rfc7009/revocation.py b/authlib/oauth2/rfc7009/revocation.py
--- /dev/null
+++ b/authlib/oauth2/rfc7009/revocation.py
@@ -0,0 +16,12 @@
+    def validate_endpoint_request(self):
+        """The client constructs the request by including the following
+        parameters using the "application/x-www-form-urlencoded" format in
+        the HTTP request entity-body:
+
+        token
+            REQUIRED.  The token that the client wants to get revoked.
+
+        token_type_hint
+            OPTIONAL.  A hint about the type of the token submitted for
+            revocation.
+        """

Output Analysis

Amazing! The diffs that are returned cover only the function I was looking for; there's no additional noise that I have to sift through. I can immediately see that in April 2024 there was a change that causes an exception to be raised when a token is found but fails the check against the OAuth client making the request. Culprit: found!
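The behavioral difference is easy to reproduce in isolation. The sketch below re-creates the two versions of the check from the diff as plain functions. FakeToken, the local InvalidGrantError, and both function names are hypothetical stand-ins invented for this illustration; they are not Authlib's classes.

```python
class InvalidGrantError(Exception):
    """Local stand-in for the error raised in the diff (assumption)."""


class FakeToken:
    """Hypothetical token that remembers which client it was issued to."""

    def __init__(self, client_id):
        self.client_id = client_id

    def check_client(self, client_id):
        return self.client_id == client_id


def authenticate_token_before(token, client_id):
    # Older logic from the diff: a client mismatch quietly yields None.
    if token and token.check_client(client_id):
        return token
    return None


def authenticate_token_after(token, client_id):
    # Newer logic from the diff: a client mismatch raises instead.
    if token and not token.check_client(client_id):
        raise InvalidGrantError()
    return token


token = FakeToken("client-a")

# Same mismatching client, two different outcomes.
before = authenticate_token_before(token, "client-b")
try:
    authenticate_token_after(token, "client-b")
    after_raised = False
except InvalidGrantError:
    after_raised = True

print(before, after_raised)
```

A test suite that relied on the quiet None result for a mismatched client would start failing after the upgrade, which matches the kind of breakage described here.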

The output also shows that the method was originally named validate_endpoint_request, was then renamed to authenticate_endpoint_credential, and finally became authenticate_token. Being able to follow renames within the same file is a huge plus, even though it wasn't relevant in this particular situation.

Additional Options

The -L flag also accepts regular expressions for more complex queries where a standard diff hunk header pattern isn't quite good enough, along with start/end line numbers and/or offsets. The docs explain the various permutations quite clearly (a rarity for git docs), and are worth a read.
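If you end up scripting these queries, the two argument shapes from the docs can be assembled mechanically. The helper below is only a sketch: funcname_arg, line_range_arg, and git_log_command are names made up for this example, and the code builds an argument list without actually running git.

```python
def funcname_arg(funcname, path):
    """Build the -L:<funcname>:<file> form."""
    return f"-L:{funcname}:{path}"


def line_range_arg(start, end, path):
    """Build the -L<start>,<end>:<file> form.

    start and end are passed through as text, so plain line numbers,
    /regex/ anchors, and +N or -N offsets for <end> can all be used.
    """
    return f"-L{start},{end}:{path}"


def git_log_command(l_arg):
    """Return an argv list that could be handed to subprocess.run()."""
    return ["git", "log", l_arg]


PATH = "authlib/oauth2/rfc7009/revocation.py"

# Trace a method by name.
by_name = git_log_command(funcname_arg("authenticate_token", PATH))

# Trace an 18-line window starting at line 18.
by_lines = git_log_command(line_range_arg(18, "+18", PATH))

print(by_name)
print(by_lines)
```

Passing the result to subprocess.run(..., cwd=<repository>) would execute the query; that step is left out so the sketch has no dependency on a local checkout.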

Next time you reach for git blame to figure out how some method has changed over time, try the very easy-to-remember and perfectly well named git log -L :<funcname>:<path>!


[1] This isn't the actual bug that I was dealing with, but it is simple enough to follow along for the sake of exposition.

[2] Weirdly enough there's no "name" for this flag that you can use as a mnemonic; it's just -L for hunk-based evolution, I guess?

[3] You may need to do this for several other languages, too, e.g. golang, rust, scheme, etc.