<?xml version="1.0"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">

<channel>
	<title>Planet Python</title>
	<link>http://planetpython.org/</link>
	<language>en</language>
	<description>Planet Python - http://planetpython.org/</description>

<item>
	<title>Real Python: Quiz: Data Management With Python, SQLite, and SQLAlchemy</title>
	<guid>https://realpython.com/quizzes/python-sqlite-sqlalchemy/</guid>
	<link>https://realpython.com/quizzes/python-sqlite-sqlalchemy/</link>
	<description>&lt;p&gt;In this quiz, you&amp;rsquo;ll test your understanding of the tutorial &lt;a href=&quot;https://realpython.com/python-sqlite-sqlalchemy/&quot;&gt;Data Management With Python, SQLite, and SQLAlchemy&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;By working through this quiz, you&amp;rsquo;ll revisit how Python, SQLite, and SQLAlchemy work together to give your programs reliable data storage.&lt;/p&gt;
&lt;p&gt;You&amp;rsquo;ll also check your grasp of primary and foreign keys, SQL operations, and the SQLAlchemy models that let you work with your data as Python objects.&lt;/p&gt;
</description>
	<pubDate>Mon, 04 May 2026 12:00:00 +0000</pubDate>
</item>
<item>
	<title>The Python Coding Stack: Do You Get It Now?</title>
	<guid>https://www.thepythoncodingstack.com/p/do-you-get-it-now-getitem-getattr-getattribute-get</guid>
	<link>https://www.thepythoncodingstack.com/p/do-you-get-it-now-getitem-getattr-getattribute-get</link>
	<description>&lt;p&gt;When you decide to learn about Python&amp;#8217;s special methods, you have to choose which ones to learn first. Some are more straightforward than others. It makes sense to prioritise them.&lt;/p&gt;&lt;p&gt;However, there are some headaches and rabbit holes along the way.&lt;/p&gt;&lt;p&gt;And one of these challenges is when you start exploring the &amp;#8220;get*&amp;#8221; dunder methods. You come across &lt;code&gt;.__getitem__()&lt;/code&gt;, &lt;code&gt;.__getattr__()&lt;/code&gt;, &lt;code&gt;.__getattribute__()&lt;/code&gt;, and &lt;code&gt;.__get__()&lt;/code&gt; and you think:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;&amp;#8220;Aren&amp;#8217;t they all the same thing? They&amp;#8217;re all &amp;#8216;getting&amp;#8217; stuff, right?&amp;#8221;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Well, yes, they&amp;#8217;re all &amp;#8220;getting stuff&amp;#8221;, which is why they have &amp;#8220;get&amp;#8221; in their names. But, as you probably guessed by now, they do different things.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;One Deals With &lt;/strong&gt;&lt;code&gt;[]&lt;/code&gt;&lt;strong&gt; &amp;#8226; The Others Deal With &lt;/strong&gt;&lt;code&gt;.&lt;/code&gt;&lt;/h2&gt;&lt;p&gt;Let&amp;#8217;s start with the odd one out, which is also possibly the least challenging of the lot. The &lt;code&gt;.__getitem__()&lt;/code&gt; special method deals with the square bracket notation, &lt;code&gt;[]&lt;/code&gt;, which you place after an object&amp;#8217;s name. These are the square brackets you use to get an item from a list or a dictionary, say. You&amp;#8217;ll see later that the other special methods with &amp;#8220;get&amp;#8221; in their name deal with the dot, &lt;code&gt;.&lt;/code&gt;, which you use in a different context in Python.&lt;/p&gt;&lt;p&gt;With lists, or any other sequence, you use the square brackets with an index that represents the position of an item within the data structure. 
So, &lt;code&gt;numbers[0]&lt;/code&gt; gives you the value of the first item in &lt;code&gt;numbers&lt;/code&gt; if &lt;code&gt;numbers&lt;/code&gt; is a sequence.&lt;/p&gt;&lt;p&gt;With dictionaries, or mappings in general, you place the key inside the square brackets to fetch the value associated with that key, such as &lt;code&gt;points[&quot;Sam&quot;]&lt;/code&gt;, which gives the value associated with the key &lt;code&gt;&quot;Sam&quot;&lt;/code&gt; if it exists.&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s explore a custom class:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!Jsas!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd238b6ff-1c70-43c9-9eb3-0ec3e0640304_1200x630.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!Jsas!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd238b6ff-1c70-43c9-9eb3-0ec3e0640304_1200x630.png&quot; width=&quot;1200&quot; height=&quot;630&quot; /&gt;&lt;/div&gt;&lt;/a&gt;All code blocks are available in text format at the end of this article &amp;#8226; #1&lt;/div&gt;&lt;p&gt;You pass a mapping to &lt;code&gt;PointsTable&lt;/code&gt; when creating an instance of the class. 
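&lt;/p&gt;&lt;p&gt;The listing above is an image. As a rough text sketch, assuming the constructor simply copies the mapping into a dictionary, the class might look like the code below. The sample names and points are borrowed from the output shown later in the article, and the variable name &lt;code&gt;table&lt;/code&gt; follows the tracebacks:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;
```python
class PointsTable:
    def __init__(self, data):
        # Copy the mapping into a plain dict held in a non-public attribute
        self._data = dict(data)


# Sample data assumed from the output shown later in the article
table = PointsTable({"Stephen": 20, "Mark": 17, "Kate": 19, "Sarah": 22})
```
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;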
The data is stored as a dictionary in the &lt;code&gt;._data&lt;/code&gt; attribute.&lt;/p&gt;&lt;p&gt;Now, let&amp;#8217;s say you&amp;#8217;d like to access points from a &lt;code&gt;PointsTable&lt;/code&gt; instance using the square bracket notation, as you would do with a dictionary:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!UacJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ed7c264-0362-4b9e-8c92-5bfd4eafb883_1200x210.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!UacJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ed7c264-0362-4b9e-8c92-5bfd4eafb883_1200x210.png&quot; width=&quot;1200&quot; height=&quot;210&quot; src=&quot;src&quot; /&gt;&lt;div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/a&gt;#2&lt;/div&gt;&lt;p&gt;Unfortunately, this doesn&amp;#8217;t work:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Traceback (most recent call last):
  ...
    print(table[&quot;Mark&quot;])
          ~~~~~^^^^^^^^
TypeError: 'PointsTable' object is not subscriptable&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The &lt;code&gt;PointsTable&lt;/code&gt; instance is not a dictionary. It only holds a dictionary as a data attribute. The &lt;code&gt;PointsTable&lt;/code&gt; object itself is not subscriptable, which means you can&amp;#8217;t use the square bracket notation to fetch an item.&lt;/p&gt;&lt;p&gt;When Python sees the square bracket notation after an object identifier, such as &lt;code&gt;table&lt;/code&gt;, it calls the class&amp;#8217;s &lt;code&gt;.__getitem__()&lt;/code&gt; special method. If it&amp;#8217;s not there, as in this case, Python raises a &lt;code&gt;TypeError&lt;/code&gt; saying the object is not subscriptable.&lt;/p&gt;&lt;p&gt;But what does this tell you? If you want to use the square bracket notation, you just need to define &lt;code&gt;.__getitem__()&lt;/code&gt; in the class:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!WK4y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0fdd839-3e4b-4cf7-a6b1-47e7e05195ef_1200x840.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!WK4y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0fdd839-3e4b-4cf7-a6b1-47e7e05195ef_1200x840.png&quot; width=&quot;1200&quot; height=&quot;840&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#3&lt;/div&gt;
&lt;p&gt;Now, Python finds the &lt;code&gt;.__getitem__()&lt;/code&gt; special method. It takes whatever you placed within the square brackets and passes it as an argument to &lt;code&gt;.__getitem__()&lt;/code&gt;. Therefore, &lt;code&gt;table[&quot;Mark&quot;]&lt;/code&gt; returns whatever the call to &lt;code&gt;PointsTable.__getitem__(table, &quot;Mark&quot;)&lt;/code&gt; returns:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;17&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The code outputs Mark&amp;#8217;s points. &lt;code&gt;PointsTable&lt;/code&gt; instances are now subscriptable since the class has a &lt;code&gt;.__getitem__()&lt;/code&gt; special method. The term &lt;em&gt;item&lt;/em&gt; generally refers to the objects contained within a data structure. Therefore, &lt;code&gt;.__getitem__()&lt;/code&gt; is there to let you &lt;em&gt;get an item&lt;/em&gt; from within a data structure.&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s make this example a bit more interesting before we move on to the other special methods with &amp;#8220;get&amp;#8221; in their name.&lt;/p&gt;&lt;p&gt;Try the following:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!tkur!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf45cd02-ccb8-49cd-b722-045b4f629ed7_1200x210.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!tkur!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf45cd02-ccb8-49cd-b722-045b4f629ed7_1200x210.png&quot; width=&quot;1200&quot; height=&quot;210&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#4&lt;/div&gt;&lt;p&gt;This 
raises an error:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Traceback (most recent call last):
  File ..., line 18, in &amp;lt;module&amp;gt;
    print(table[&quot;Mark&quot;, &quot;Stephen&quot;])
          ~~~~~^^^^^^^^^^^^^^^^^^^
  File ..., line 6, in __getitem__
    return self._data[item]
           ~~~~~~~~~~^^^^^^
KeyError: ('Mark', 'Stephen')&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;There&amp;#8217;s no key equal to &lt;code&gt;(&quot;Mark&quot;, &quot;Stephen&quot;)&lt;/code&gt;, so you get a &lt;code&gt;KeyError&lt;/code&gt;. Note how Python places parentheses around the two names when showing you the &lt;code&gt;KeyError&lt;/code&gt;, even though you didn&amp;#8217;t use parentheses within the square brackets in &lt;code&gt;table[&quot;Mark&quot;, &quot;Stephen&quot;]&lt;/code&gt;. This gives you a clue as to what&amp;#8217;s happening. But let&amp;#8217;s check this so we&amp;#8217;re sure and don&amp;#8217;t just rely on our hunches:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!eCzw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F976f6780-c9c5-4ba1-90fd-53ae281eae8d_1200x714.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!eCzw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F976f6780-c9c5-4ba1-90fd-53ae281eae8d_1200x714.png&quot; width=&quot;1200&quot; height=&quot;714&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#5&lt;/div&gt;&lt;p&gt;You add two &lt;code&gt;print()&lt;/code&gt; calls in &lt;code&gt;.__getitem__()&lt;/code&gt;. 
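&lt;/p&gt;&lt;p&gt;In text form, the two extra calls might look like the sketch below. The listing above is an image, so the exact form of the prints is an assumption. The f-string &lt;code&gt;=&lt;/code&gt; specifier is used because it produces output in the &lt;code&gt;item=...&lt;/code&gt; style, and the lookup is wrapped in &lt;code&gt;try&lt;/code&gt;/&lt;code&gt;except&lt;/code&gt; only so that the sketch runs to completion:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;
```python
class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        # Show exactly what Python hands over from inside the square brackets
        print(f"{item=}")
        print(f"{type(item)=}")
        return self._data[item]


# Sample data assumed from the output shown later in the article
table = PointsTable({"Stephen": 20, "Mark": 17, "Kate": 19, "Sarah": 22})

try:
    table["Mark", "Stephen"]
except KeyError as error:
    # The comma-separated names arrive as a single tuple, which is not a key
    missing_key = error.args[0]
```
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;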
Here&amp;#8217;s what they print out before Python raises the &lt;code&gt;KeyError&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;item=('Mark', 'Stephen')
type(item)=&amp;lt;class 'tuple'&amp;gt;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;The object Python passes to &lt;code&gt;.__getitem__()&lt;/code&gt; is the tuple &lt;code&gt;(&quot;Mark&quot;, &quot;Stephen&quot;)&lt;/code&gt;. You decide to make your &lt;code&gt;PointsTable&lt;/code&gt; class super-flexible so you can include more than one name within the square brackets:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!GN9V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3f5248c-f575-409f-9317-f1d096d3d981_1200x798.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!GN9V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3f5248c-f575-409f-9317-f1d096d3d981_1200x798.png&quot; width=&quot;1200&quot; height=&quot;798&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#6&lt;/div&gt;&lt;p&gt;The &lt;code&gt;.__getitem__()&lt;/code&gt; method now first checks whether &lt;code&gt;item&lt;/code&gt; is a tuple. If it is, then &lt;code&gt;.__getitem__()&lt;/code&gt; returns a tuple containing the value for each key. 
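&lt;/p&gt;&lt;p&gt;One way to write that check is sketched below. It is not necessarily the listing shown in the image above, and both the sample data and the lookups at the bottom are assumptions chosen to match the points used elsewhere in the article:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;
```python
class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        if isinstance(item, tuple):
            # Several keys inside the brackets: look up each one in turn
            return tuple(self._data[key] for key in item)
        # A single key: behave like a normal dictionary lookup
        return self._data[item]


# Sample data and lookups are assumptions for illustration
table = PointsTable({"Stephen": 20, "Mark": 17, "Kate": 19, "Sarah": 22})
pair = table["Mark", "Stephen"]
trio = table["Sarah", "Stephen", "Kate"]
single = table["Mark"]
```
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;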
If &lt;code&gt;item&lt;/code&gt; is not a tuple, then it must be just a single key, and the square brackets behave in the normal way:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;(17, 20)
(22, 20, 19)
17&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;But note that a key that is itself a tuple can no longer be looked up this way, since &lt;code&gt;.__getitem__()&lt;/code&gt; now treats every tuple as a collection of separate keys. This is fine in this example since all the keys are strings with people&amp;#8217;s names.&lt;/p&gt;&lt;p&gt;So, &lt;code&gt;.__getitem__()&lt;/code&gt; gives you full control over what happens when you use the square bracket notation to fetch an item from an object. Now, let&amp;#8217;s move on to the other &amp;#8220;get*&amp;#8221; special methods.&lt;/p&gt;&lt;div&gt;&lt;hr /&gt;&lt;/div&gt;&lt;h2&gt;&lt;strong&gt;Accessing Data Using The Dot Notation, &lt;/strong&gt;&lt;code&gt;.&lt;/code&gt;&lt;/h2&gt;&lt;p&gt;You decide you also want &lt;code&gt;PointsTable&lt;/code&gt; to work with the dot notation, so that you can use &lt;code&gt;table.Mark&lt;/code&gt; to get the number of points Mark has.&lt;/p&gt;&lt;p&gt;Note that I&amp;#8217;m adding more features to this class to demonstrate various special methods in this article. 
I&amp;#8217;m not suggesting it&amp;#8217;s necessarily always desirable to try to be too fancy with your classes!&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s try it out directly first, without making any changes to the class:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!nohu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc743712-9d55-4eac-9780-240c7c76a2db_1200x210.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!nohu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc743712-9d55-4eac-9780-240c7c76a2db_1200x210.png&quot; width=&quot;1200&quot; height=&quot;210&quot; src=&quot;src&quot; /&gt;&lt;div&gt;&lt;/div&gt;&lt;/div&gt;&lt;/a&gt;#7&lt;/div&gt;&lt;p&gt;Unfortunately, &lt;code&gt;table.Mark&lt;/code&gt; raises an error:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Traceback (most recent call last):
  File ..., line 32, in &amp;lt;module&amp;gt;
    print(table.Mark)
          ^^^^^^^^^^
AttributeError: 'PointsTable' object has no attribute 'Mark'&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Python is looking for an attribute called &lt;code&gt;Mark&lt;/code&gt;. Attributes are the things you can access using the dot notation in Python. Typically, these are data attributes, methods, and properties. As with most things in Python, special methods control how this access works. But this is where it gets a bit complicated.&lt;/p&gt;&lt;h3&gt;&lt;code&gt;.__getattr__()&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;Let&amp;#8217;s take this one step at a time and start with the &lt;code&gt;.__getattr__()&lt;/code&gt; special method. You can guess that this special method name stands for &lt;em&gt;get attribute&lt;/em&gt; (but it&amp;#8217;s not the only special method that stands for &lt;em&gt;get attribute&lt;/em&gt;!). Let&amp;#8217;s add this method:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!tgaB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd594fd71-9cb2-4a19-b1ec-a5cf54680da4_1200x1176.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!tgaB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd594fd71-9cb2-4a19-b1ec-a5cf54680da4_1200x1176.png&quot; width=&quot;1200&quot; height=&quot;1176&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#8&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;.__getattr__()&lt;/code&gt; special method has a &lt;code&gt;print()&lt;/code&gt; call to show when it&amp;#8217;s called. This line is not required, but it will help you figure out when Python calls this special method. It&amp;#8217;s a way of peeking into Python&amp;#8217;s inner workings! For completeness, I added a similar line to &lt;code&gt;.__getitem__()&lt;/code&gt; as well.&lt;/p&gt;&lt;p&gt;There are two &lt;code&gt;print()&lt;/code&gt; calls at the bottom: one uses the square bracket notation, which you dealt with in the previous section, and the other uses the dot notation. Now, both work:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getitem__ with argument item='Mark'
17
Calling __getattr__ with argument name='Mark'
17&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you&amp;#8217;ve seen in the previous section, the square brackets notation relies on &lt;code&gt;.__getitem__()&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;And as the third and fourth lines in this printout show, Python used &lt;code&gt;.__getattr__()&lt;/code&gt; to deal with &lt;code&gt;table.Mark&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Python passed the attribute name to the &lt;code&gt;.__getattr__()&lt;/code&gt; special method&amp;#8217;s &lt;code&gt;name&lt;/code&gt; parameter. This method then fetches the value associated with this name from the dictionary &lt;code&gt;._data&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;This trick only works for keys that are valid attribute names and that don&amp;#8217;t clash with real attributes or methods. That&amp;#8217;s one reason dictionary-style access is often preferable for arbitrary user data. But let&amp;#8217;s stick with this exercise in this article.&lt;/p&gt;&lt;p&gt;Also, you wrote the code such that if &lt;code&gt;self._data[name]&lt;/code&gt; raises a &lt;code&gt;KeyError&lt;/code&gt;, then &lt;code&gt;.__getattr__()&lt;/code&gt; raises an &lt;code&gt;AttributeError&lt;/code&gt;, which is what you&amp;#8217;d expect if you try to use a name that&amp;#8217;s not an attribute after the dot:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!UFQW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe07165cb-89fe-48a3-9200-e21d9bc291ae_1200x210.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!UFQW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe07165cb-89fe-48a3-9200-e21d9bc291ae_1200x210.png&quot; width=&quot;1200&quot; height=&quot;210&quot; src=&quot;src&quot; 
/&gt;&lt;/div&gt;&lt;/a&gt;#9&lt;/div&gt;&lt;p&gt;This raises an &lt;code&gt;AttributeError&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;AttributeError: 'PointsTable' object has no attribute 'Matilda'&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;It seems that &lt;code&gt;.__getattr__()&lt;/code&gt; is analogous to &lt;code&gt;.__getitem__()&lt;/code&gt;: the first deals with &lt;code&gt;.&lt;/code&gt; and the second with &lt;code&gt;[]&lt;/code&gt;. But...&lt;/p&gt;&lt;p&gt;Things are a bit more complex. What if you try to access a standard attribute, say a data attribute or a method? You haven&amp;#8217;t defined any regular (non-special) methods in this class, but you do have a data attribute, &lt;code&gt;._data&lt;/code&gt;. It&amp;#8217;s marked with a leading underscore, showing you shouldn&amp;#8217;t access it directly. But you can &amp;#8220;break the rules&amp;#8221;. Python won&amp;#8217;t stop you:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!hBj-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2dc4d9-5285-4e91-aba6-09f5d5b915f0_1200x210.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!hBj-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2dc4d9-5285-4e91-aba6-09f5d5b915f0_1200x210.png&quot; width=&quot;1200&quot; height=&quot;210&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#10&lt;/div&gt;&lt;p&gt;The output is the following:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;{'Stephen': 20, 'Mark': 17, 'Kate': 19, 'Sarah': 22}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Perhaps this is not surprising. But note that there&amp;#8217;s no printout that says that &lt;code&gt;.__getattr__()&lt;/code&gt; was called, as you got when you used &lt;code&gt;table.Mark&lt;/code&gt;. Let&amp;#8217;s look at both outputs together:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!7mTC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27b2ee66-3944-473f-a6b1-579871c9536b_1200x252.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!7mTC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27b2ee66-3944-473f-a6b1-579871c9536b_1200x252.png&quot; width=&quot;1200&quot; height=&quot;252&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#11&lt;/div&gt;&lt;p&gt;The output now is the following:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattr__ with argument name='Mark'
17
{'Stephen': 20, 'Mark': 17, 'Kate': 19, 'Sarah': 22}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Python called the &lt;code&gt;.__getattr__()&lt;/code&gt; special method when you accessed &lt;code&gt;.Mark&lt;/code&gt;, but not when you accessed &lt;code&gt;._data&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;This means there&amp;#8217;s something else happening behind the scenes.&lt;/p&gt;&lt;h3&gt;&lt;code&gt;.__getattribute__()&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;This is where &lt;code&gt;.__getattribute__()&lt;/code&gt; enters the scene. No prizes for guessing that this method name stands for &lt;em&gt;get attribute&lt;/em&gt;, but then so does &lt;code&gt;.__getattr__()&lt;/code&gt;. This is confusing.&lt;/p&gt;&lt;p&gt;And here&amp;#8217;s the thing. It&amp;#8217;s &lt;code&gt;.__getattribute__()&lt;/code&gt; that gets called each and every time you use the dot notation with an instance. Let&amp;#8217;s define it:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!DwOO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84a1462-4308-4235-adc5-6dea98293219_1236x924.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!DwOO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd84a1462-4308-4235-adc5-6dea98293219_1236x924.png&quot; width=&quot;1236&quot; height=&quot;924&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#12&lt;/div&gt;
&lt;p&gt;Note that this code breaks lots of things as it is now. We&amp;#8217;ll fix it soon. Let&amp;#8217;s first look at the output from this code:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattribute__ with argument name='Mark'
None
Calling __getattribute__ with argument name='_data'
None&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;As you can see, Python calls &lt;code&gt;.__getattribute__()&lt;/code&gt; for both &lt;code&gt;table.Mark&lt;/code&gt; and &lt;code&gt;table._data&lt;/code&gt;. However, the current &lt;code&gt;.__getattribute__()&lt;/code&gt; method does nothing other than print, so it implicitly returns &lt;code&gt;None&lt;/code&gt;. You&amp;#8217;ll see soon that this is a bigger problem than you might think.&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s try using the dot notation to access an attribute that doesn&amp;#8217;t exist:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!ORHL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58038f2d-a8f0-4b1c-81ac-1fc5067d4b35_1200x294.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!ORHL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58038f2d-a8f0-4b1c-81ac-1fc5067d4b35_1200x294.png&quot; width=&quot;1200&quot; height=&quot;294&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#13&lt;/div&gt;&lt;p&gt;Every time you use the dot, &lt;code&gt;.&lt;/code&gt;, Python calls &lt;code&gt;.__getattribute__()&lt;/code&gt;, which, for now, does nothing except print a message to show it was called:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattribute__ with argument name='Mark'
None
Calling __getattribute__ with argument name='_data'
None
Calling __getattribute__ with argument name='Janine'
None&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Therefore, you can&amp;#8217;t access any attribute at the moment. All data attributes and methods are out of reach. You should be careful when overriding &lt;code&gt;.__getattribute__()&lt;/code&gt;. In fact, you&amp;#8217;ll rarely need to write your own &lt;code&gt;.__getattribute__()&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s fix this code by calling &lt;code&gt;object.__getattribute__()&lt;/code&gt; from within &lt;code&gt;PointsTable.__getattribute__()&lt;/code&gt;. Recall that all Python classes inherit from &lt;code&gt;object&lt;/code&gt;. And &lt;code&gt;object.__getattribute__()&lt;/code&gt; holds the logic Python uses to decide what to do when you use the &lt;code&gt;.&lt;/code&gt;. You can reach the &lt;code&gt;object&lt;/code&gt; implementation using &lt;code&gt;super()&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!-ER0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d686747-479b-4efa-8c4c-3b77b22bad98_1236x924.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!-ER0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d686747-479b-4efa-8c4c-3b77b22bad98_1236x924.png&quot; width=&quot;1236&quot; height=&quot;924&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#14&lt;/div&gt;
&lt;p&gt;Note that in this class, &lt;code&gt;.__getattribute__()&lt;/code&gt; does nothing different from the default version except print a statement. I&amp;#8217;m using this for demonstration purposes only.&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s deal with one &lt;code&gt;print()&lt;/code&gt; call at a time. The code above includes &lt;code&gt;print(table._data)&lt;/code&gt;. You&amp;#8217;re fetching the value of a data attribute:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattribute__ with argument name='_data'
{'Stephen': 20, 'Mark': 17, 'Kate': 19, 'Sarah': 22}&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Python calls &lt;code&gt;PointsTable.__getattribute__()&lt;/code&gt;, which in turn calls the base class&amp;#8217;s &lt;code&gt;object.__getattribute__()&lt;/code&gt;. This special method recognises that &lt;code&gt;._data&lt;/code&gt; is an instance variable and fetches its value from the object&amp;#8217;s &lt;code&gt;.__dict__&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;But let&amp;#8217;s see what happens with the &lt;code&gt;.Mark&lt;/code&gt; attribute access. Recall that &lt;code&gt;.Mark&lt;/code&gt; is not a data attribute or method:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!YSbV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7369d87f-c3ac-4252-85b8-665a45655cf0_1200x210.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!YSbV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7369d87f-c3ac-4252-85b8-665a45655cf0_1200x210.png&quot; width=&quot;1200&quot; height=&quot;210&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#15&lt;/div&gt;&lt;p&gt;Here&amp;#8217;s the output now. Look at the various printouts, too:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattribute__ with argument name='Mark'
Calling __getattr__ with argument name='Mark'
Calling __getattribute__ with argument name='_data'
17&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;When Python sees &lt;code&gt;table.Mark&lt;/code&gt;, it calls &lt;code&gt;PointsTable.__getattribute__(table, &quot;Mark&quot;)&lt;/code&gt;. This call gives rise to the first line printed out above. The base class&amp;#8217;s &lt;code&gt;.__getattribute__()&lt;/code&gt; doesn&amp;#8217;t recognise &lt;code&gt;&quot;Mark&quot;&lt;/code&gt; as a data attribute, a method name, or anything else it&amp;#8217;s expecting (such as descriptors, which we will discuss soon). So, &lt;code&gt;object.__getattribute__()&lt;/code&gt; actually raises an &lt;code&gt;AttributeError&lt;/code&gt; in this case, which you can&amp;#8217;t see in the output above.&lt;/p&gt;&lt;p&gt;That&amp;#8217;s because when &lt;code&gt;.__getattribute__()&lt;/code&gt; raises an &lt;code&gt;AttributeError&lt;/code&gt; during normal attribute access using the dot, Python checks whether there&amp;#8217;s a &lt;code&gt;.__getattr__()&lt;/code&gt; defined. If there is, it uses it as a fallback. That&amp;#8217;s what happens in this case. You can see that &lt;code&gt;.__getattr__()&lt;/code&gt; is called next &amp;#8211; that&amp;#8217;s the second line printed out above.&lt;/p&gt;&lt;p&gt;But &lt;code&gt;.__getattr__()&lt;/code&gt; contains this line: &lt;code&gt;return self._data[name]&lt;/code&gt;. There&amp;#8217;s a dot in that line, too. So Python needs to call &lt;code&gt;.__getattribute__()&lt;/code&gt; again, this time using &lt;code&gt;&quot;_data&quot;&lt;/code&gt; as the argument. That&amp;#8217;s the third line printed out above. Since &lt;code&gt;._data&lt;/code&gt; is a data attribute, &lt;code&gt;object.__getattribute__()&lt;/code&gt; finds it directly, so Python doesn&amp;#8217;t need the &lt;code&gt;.__getattr__()&lt;/code&gt; fallback this time.&lt;/p&gt;&lt;p&gt;Confused? You&amp;#8217;re not alone. This is confusing, I know. And there&amp;#8217;s a bit more. 
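&lt;/p&gt;&lt;p&gt;If you&amp;#8217;d like to run this trace yourself, here&amp;#8217;s a minimal text sketch. It&amp;#8217;s a reconstruction from the printed output rather than a copy of the code in the images, so treat the class body as an assumption. The &lt;code&gt;mark_points&lt;/code&gt; variable is a hypothetical name added for illustration:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;
```python
# Assumed reconstruction of PointsTable, based on the printed output above
class PointsTable:
    def __init__(self):
        self._data = {"Stephen": 20, "Mark": 17, "Kate": 19, "Sarah": 22}

    def __getattribute__(self, name):
        # Runs on every dot access, including self._data inside __getattr__
        print(f"Calling __getattribute__ with argument name={name!r}")
        return super().__getattribute__(name)

    def __getattr__(self, name):
        # Runs only after __getattribute__ has raised AttributeError
        print(f"Calling __getattr__ with argument name={name!r}")
        return self._data[name]


table = PointsTable()
mark_points = table.Mark  # not a real attribute, so the fallback is used
print(mark_points)
```
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Running this should print the same three trace lines as above, followed by &lt;code&gt;17&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;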
But we&amp;#8217;ll make sure things are clear by the end of this article.&lt;/p&gt;&lt;p&gt;But first, let&amp;#8217;s focus on &lt;code&gt;table.Mark&lt;/code&gt; again and let&amp;#8217;s comment out, just for now, the definition of &lt;code&gt;.__getattr__()&lt;/code&gt;:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!MknK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbedfef1d-4c6b-43a6-b630-400bd8b8c16b_1236x1218.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!MknK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbedfef1d-4c6b-43a6-b630-400bd8b8c16b_1236x1218.png&quot; width=&quot;1236&quot; height=&quot;1218&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#16&lt;/div&gt;&lt;p&gt;The base class&amp;#8217;s &lt;code&gt;.__getattribute__()&lt;/code&gt; method works hard to determine whether &lt;code&gt;&quot;Mark&quot;&lt;/code&gt; is an attribute it expects. It doesn&amp;#8217;t find it, and now, there&amp;#8217;s no fallback &lt;code&gt;.__getattr__()&lt;/code&gt;, so Python raises an error:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattribute__ with argument name='Mark'
Calling __getattribute__ with argument name='__dict__'
Calling __getattribute__ with argument name='__class__'
Traceback (most recent call last):
  ...
AttributeError: 'PointsTable' object has no attribute 'Mark'&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You can see a couple of extra printouts, too. These are side effects of Python preparing the error message since we get the printout each time Python uses the dot notation, even when it does so behind the scenes. Let&amp;#8217;s not go too far down the rabbit hole in this article. We may never come out again if we go too deep!&lt;/p&gt;&lt;p&gt;You&amp;#8217;ve seen that when you use the dot notation, Python calls &lt;code&gt;object.__getattribute__()&lt;/code&gt; eventually. That&amp;#8217;s why you should include &lt;code&gt;super().__getattribute__()&lt;/code&gt; when you override this special method, unless you have a clear (niche) reason why you don&amp;#8217;t want to do this. This special method looks in a number of places for the attribute. I&amp;#8217;m going to skip a few steps in the hierarchy for now. But we&amp;#8217;ll revisit these in the next section (remember, we&amp;#8217;re not done yet since there&amp;#8217;s still &lt;code&gt;.__get__()&lt;/code&gt; to deal with).&lt;/p&gt;&lt;p&gt;The &lt;code&gt;.__getattribute__()&lt;/code&gt; special method looks for instance attributes. 
Let&amp;#8217;s temporarily turn &lt;code&gt;.Mark&lt;/code&gt; into an instance attribute:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!coz6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06fa5c82-d0cd-4a53-8d76-cc9c486429ed_1236x966.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!coz6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06fa5c82-d0cd-4a53-8d76-cc9c486429ed_1236x966.png&quot; width=&quot;1236&quot; height=&quot;966&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#17&lt;/div&gt;&lt;p&gt;Now, &lt;code&gt;.Mark&lt;/code&gt; is a data attribute, so &lt;code&gt;.__getattribute__()&lt;/code&gt; finds it:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattribute__ with argument name='Mark'
This is Mark as a data attribute&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;If it&amp;#8217;s not an instance variable, &lt;code&gt;object.__getattribute__()&lt;/code&gt; also checks whether it&amp;#8217;s a class attribute. Let&amp;#8217;s test this:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!lD3k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92086ec8-a13c-470b-92aa-a9c986526ea9_1236x1050.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!lD3k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F92086ec8-a13c-470b-92aa-a9c986526ea9_1236x1050.png&quot; width=&quot;1236&quot; height=&quot;1050&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#18&lt;/div&gt;&lt;p&gt;Here&amp;#8217;s the output:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling __getattribute__ with argument name='Mark'
This is Mark as a class attribute&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Note that the line in &lt;code&gt;.__init__()&lt;/code&gt; that defines &lt;code&gt;.Mark&lt;/code&gt; as a data attribute is commented out now because instance attributes take precedence over class attributes. Only when all else fails (including the steps I skipped for now), does Python check whether the fallback special method &lt;code&gt;.__getattr__()&lt;/code&gt; is there.&lt;/p&gt;&lt;p&gt;So, when you use the dot notation for attribute access (the simplified version, for now):&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;Python calls &lt;code&gt;.__getattribute__()&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;It checks for known attributes, such as instance attributes and class attributes (and a bit more &amp;#8211; coming soon)&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;If it doesn&amp;#8217;t find anything, then &lt;code&gt;.__getattribute__()&lt;/code&gt; raises an &lt;code&gt;AttributeError&lt;/code&gt;. However, Python does one final check before letting this error through to the user: is there a &lt;code&gt;.__getattr__()&lt;/code&gt;? If there is, Python calls it to see whether it contains instructions for handling this unknown attribute.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;I&amp;#8217;m simplifying slightly here. Descriptors complicate the order, and we&amp;#8217;ll return to them in the next section.&lt;/p&gt;&lt;p&gt;Note that in real code, you rarely need to define &lt;code&gt;.__getattribute__()&lt;/code&gt;. You can rely on the default provided in the &lt;code&gt;object&lt;/code&gt; base class in most cases. Most custom attribute behaviour should use &lt;code&gt;.__getattr__()&lt;/code&gt; rather than &lt;code&gt;.__getattribute__()&lt;/code&gt;. Use &lt;code&gt;.__getattribute__()&lt;/code&gt; only when you need to intercept every attribute access. 
There is rarely a need for this, though.&lt;/p&gt;&lt;p&gt;Common uses for &lt;code&gt;.__getattr__()&lt;/code&gt; include delegating missing attributes to another object, implementing lazy loading, or exposing dynamic attributes from structured data.&lt;/p&gt;&lt;p&gt;Here&amp;#8217;s the tidied-up code in full so far:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!-J5g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb191478-cb58-48dc-a1cd-9755b1dba0de_1236x1596.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!-J5g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb191478-cb58-48dc-a1cd-9755b1dba0de_1236x1596.png&quot; width=&quot;1236&quot; height=&quot;1596&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#19&lt;/div&gt;&lt;h3&gt;&lt;strong&gt;The Asymmetry Between &lt;/strong&gt;&lt;code&gt;.__getattr__()&lt;/code&gt;&lt;strong&gt; and &lt;/strong&gt;&lt;code&gt;.__setattr__()&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;A short note: There&amp;#8217;s an asymmetry between &lt;code&gt;.__getattr__()&lt;/code&gt; and &lt;code&gt;.__setattr__()&lt;/code&gt; despite their names following the same pattern. 
As you&amp;#8217;ve seen above, &lt;code&gt;.__getattr__()&lt;/code&gt; is only used as a fallback when &lt;code&gt;.__getattribute__()&lt;/code&gt; doesn&amp;#8217;t find the attribute through &amp;#8220;normal routes&amp;#8221;.&lt;/p&gt;&lt;p&gt;However, &lt;code&gt;.__setattr__()&lt;/code&gt; doesn&amp;#8217;t have a counterpart analogous to &lt;code&gt;.__getattribute__()&lt;/code&gt;. Therefore, &lt;code&gt;.__setattr__()&lt;/code&gt; is always used when setting the value of an attribute. Life is simpler in the &amp;#8220;setting&amp;#8221; world!&lt;/p&gt;&lt;p&gt;In this article, I&amp;#8217;m focusing on the &amp;#8220;getting&amp;#8221; part of things, but some of the same logic applies to &amp;#8220;setting&amp;#8221;, too.&lt;/p&gt;&lt;div class=&quot;pullquote&quot;&gt;&lt;p&gt;&lt;em&gt;All &lt;a href=&quot;https://thepythoncodingplace.com?utm_source=the-python-coding-stack&quot;&gt;The Python Coding Place&lt;/a&gt; video courses are included in a single, cost-effective bundle. The courses cover beginner and intermediate levels, and you also get access to a members-only forum.&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;a href=&quot;https://thepythoncodingplace.thinkific.com/enroll/2731141&quot;&gt;Get All The Python Coding Place Courses in One Bundle&lt;/a&gt;&lt;/p&gt;&lt;/div&gt;&lt;h2&gt;&lt;strong&gt;The Fourth Horseman: &lt;/strong&gt;&lt;code&gt;.__get__()&lt;/code&gt;&lt;/h2&gt;&lt;p&gt;I&amp;#8217;ve had this article in my pipeline here on &lt;em&gt;The Python Coding Stack&lt;/em&gt; for a very long time. But I never wrote it because I knew that dealing with &lt;code&gt;.__get__()&lt;/code&gt; would be a pain and would need a lot of time and space. However, in March, I published &lt;a href=&quot;https://www.thepythoncodingstack.com/p/python-descriptors-the-price-is-right&quot;&gt;The Weird and Wonderful World of Descriptors in Python&lt;/a&gt;. 
If you haven&amp;#8217;t read that article yet, well, now is the time!&lt;/p&gt;&lt;p&gt;And therefore I can cheat in this article. I can avoid talking about &lt;code&gt;.__get__()&lt;/code&gt; in detail. What follows in this section is a summary of the world of descriptors. See the full article for more details.&lt;/p&gt;&lt;p&gt;If a class has &lt;code&gt;.__get__()&lt;/code&gt;, it&amp;#8217;s a descriptor class. If it only has &lt;code&gt;.__get__()&lt;/code&gt; and it does not define &lt;code&gt;.__set__()&lt;/code&gt; or &lt;code&gt;.__delete__()&lt;/code&gt;, which are the other methods that make up the descriptor protocol, then the class creates &lt;em&gt;non-data descriptors&lt;/em&gt;. Classes that include &lt;code&gt;.__set__()&lt;/code&gt; or &lt;code&gt;.__delete__()&lt;/code&gt; create &lt;em&gt;data descriptors&lt;/em&gt;. This distinction matters when we get back to &lt;code&gt;.__getattribute__()&lt;/code&gt; and the order it uses to look for known attributes.&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s explore this with a dummy example. 
This code defines two descriptor classes and another class to test the priority order that &lt;code&gt;object.__getattribute__()&lt;/code&gt; uses for different attribute types:&lt;/p&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!UR6A!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F597a1293-508e-4371-9740-e538e926372f_1362x1890.png&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!UR6A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F597a1293-508e-4371-9740-e538e926372f_1362x1890.png&quot; width=&quot;1362&quot; height=&quot;1890&quot; /&gt;&lt;/div&gt;&lt;/a&gt;#20&lt;/div&gt;&lt;p&gt;&lt;code&gt;DataDescriptor&lt;/code&gt; defines &lt;code&gt;.__get__()&lt;/code&gt; and &lt;code&gt;.__set__()&lt;/code&gt; methods, which is why it creates data descriptors. However, &lt;code&gt;NonDataDescriptor&lt;/code&gt; creates non-data descriptors since the class only defines &lt;code&gt;.__get__()&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;All the special methods in all classes (except &lt;code&gt;.__init__()&lt;/code&gt;) have &lt;code&gt;print()&lt;/code&gt; calls so you can see when Python calls them. 
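&lt;/p&gt;&lt;p&gt;Because the example above is shown as an image, here&amp;#8217;s a text sketch you can run. It&amp;#8217;s reconstructed from the description and the printed output, so the method bodies are assumptions about the original code. The &lt;code&gt;results&lt;/code&gt; dictionary and the loop using &lt;code&gt;getattr()&lt;/code&gt; are hypothetical shorthand for the five &lt;code&gt;print(test.&amp;#8230;)&lt;/code&gt; calls. &lt;code&gt;getattr(test, name)&lt;/code&gt; follows the same lookup path as the dot notation:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;
```python
class DataDescriptor:
    # Defines __get__ and __set__, so instances are data descriptors
    def __get__(self, instance, owner):
        print("Calling DataDescriptor.__get__")
        return "This is the Data Descriptor"

    def __set__(self, instance, value):
        # Demo only: report the call without storing anything
        print(f"Calling DataDescriptor.__set__ with value={value!r}")


class NonDataDescriptor:
    # Defines only __get__, so instances are non-data descriptors
    def __get__(self, instance, owner):
        print("Calling NonDataDescriptor.__get__")
        return "This is the Non-Data Descriptor"


class TestingAttributeAccess:
    first = DataDescriptor()
    second = NonDataDescriptor()
    third = NonDataDescriptor()
    fourth = "I'm just a normal class attribute!"

    def __init__(self):
        # Handled by DataDescriptor.__set__
        self.first = "'first': a data attribute defined in .__init__"
        # Stored in the instance __dict__, shadowing the non-data descriptor
        self.second = "'second': a data attribute defined in .__init__"

    def __getattribute__(self, name):
        print(f"Calling __getattribute__ with argument name={name!r}")
        return super().__getattribute__(name)

    def __getattr__(self, name):
        print(f"Calling __getattr__ with argument name={name!r}")
        return "This is the fallback value from __getattr__"


test = TestingAttributeAccess()
results = {}
for name in ("first", "second", "third", "fourth", "fifth"):
    print(f"\nPrinting `test.{name}`")
    results[name] = getattr(test, name)  # same lookup as test.first, etc.
    print(results[name])
```
&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;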
Before we look at the output from this code, let&amp;#8217;s review the five attributes you use in the &lt;code&gt;print()&lt;/code&gt; calls and where they appear in the class definitions:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;.first&lt;/code&gt; is defined as a data descriptor at the top of the &lt;code&gt;TestingAttributeAccess&lt;/code&gt; class. Recall from the article on descriptors that you initially define descriptors as class attributes within a class. You also assign a value to &lt;code&gt;.first&lt;/code&gt; within the class&amp;#8217;s &lt;code&gt;.__init__()&lt;/code&gt;. More on this soon.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;.second&lt;/code&gt; is also defined as a descriptor at the top of the &lt;code&gt;TestingAttributeAccess&lt;/code&gt; class, but it&amp;#8217;s a non-data descriptor since the &lt;code&gt;NonDataDescriptor&lt;/code&gt; class only defines &lt;code&gt;.__get__()&lt;/code&gt;. You also assign a value to &lt;code&gt;self.second&lt;/code&gt; within &lt;code&gt;.__init__()&lt;/code&gt;. We&amp;#8217;ll see how the class actually deals with &lt;code&gt;.second&lt;/code&gt; soon, since it treats it differently from &lt;code&gt;.first&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;.third&lt;/code&gt; is also defined as a non-data descriptor. However, there is no other assignment to &lt;code&gt;third&lt;/code&gt; within &lt;code&gt;.__init__()&lt;/code&gt;. This is the key difference between &lt;code&gt;.second&lt;/code&gt; and &lt;code&gt;.third&lt;/code&gt; in this code.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;.fourth&lt;/code&gt; is a class attribute. 
It&amp;#8217;s not a descriptor and it&amp;#8217;s not an instance attribute.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;code&gt;.fifth&lt;/code&gt; is not defined anywhere in the class, but you still call &lt;code&gt;print(test.fifth)&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;Let&amp;#8217;s look at the whole output from this code and then break it down into steps later:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling DataDescriptor.__set__ with value=&quot;'first': a data attribute defined in .__init__&quot;

Printing `test.first`
Calling __getattribute__ with argument name='first'
Calling DataDescriptor.__get__
This is the Data Descriptor

Printing `test.second`
Calling __getattribute__ with argument name='second'
'second': a data attribute defined in .__init__

Printing `test.third`
Calling __getattribute__ with argument name='third'
Calling NonDataDescriptor.__get__
This is the Non-Data Descriptor

Printing `test.fourth`
Calling __getattribute__ with argument name='fourth'
I'm just a normal class attribute!

Printing `test.fifth`
Calling __getattribute__ with argument name='fifth'
Calling __getattr__ with argument name='fifth'
This is the fallback value from __getattr__&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;There&amp;#8217;s plenty to digest there. So, let&amp;#8217;s break it down in stages.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;0. Creating an instance of &lt;/strong&gt;&lt;code&gt;TestingAttributeAccess&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;I want to focus on the &amp;#8220;getting&amp;#8221; bit from the various &lt;code&gt;print()&lt;/code&gt; calls. However, this line appears in the output first, so let&amp;#8217;s deal with it, too:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Calling DataDescriptor.__set__ with value=&quot;'first': a data attribute defined in .__init__&quot;&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This happens when you create the instance of &lt;code&gt;TestingAttributeAccess&lt;/code&gt; and assign it to the variable name &lt;code&gt;test&lt;/code&gt;. Why? Because &lt;code&gt;.first&lt;/code&gt; has already been defined as a data descriptor at the time of the class definition. Therefore, when you assign a value to &lt;code&gt;.first&lt;/code&gt; during initialisation, the descriptor protocol kicks in.&lt;/p&gt;&lt;p&gt;When you call &lt;code&gt;TestingAttributeAccess()&lt;/code&gt;, Python calls the class&amp;#8217;s &lt;code&gt;.__init__()&lt;/code&gt; and soon finds this assignment statement: &lt;code&gt;self.first = ...&lt;/code&gt;&lt;/p&gt;&lt;p&gt;Therefore, Python calls &lt;code&gt;DataDescriptor.__set__()&lt;/code&gt; to set the value of this attribute. Note that &lt;code&gt;DataDescriptor.__set__()&lt;/code&gt; doesn&amp;#8217;t really set anything in this case. Normally, something more meaningful would happen in &lt;code&gt;.__set__()&lt;/code&gt;. See &lt;a href=&quot;https://www.thepythoncodingstack.com/p/python-descriptors-the-price-is-right&quot;&gt;The Weird and Wonderful World of Descriptors in Python&lt;/a&gt; for more on this.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;1. 
&lt;/strong&gt;&lt;code&gt;test.first&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;The next section in the code&amp;#8217;s output is the following:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Printing `test.first`
Calling __getattribute__ with argument name='first'
Calling DataDescriptor.__get__
This is the Data Descriptor&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;This output is created when you write &lt;code&gt;print(test.first)&lt;/code&gt;. The dot notation triggers Python to call &lt;code&gt;.__getattribute__()&lt;/code&gt;. So, let&amp;#8217;s start exploring the hierarchy of checks in &lt;code&gt;.__getattribute__()&lt;/code&gt;. What does this special method look for first?&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The &lt;/strong&gt;&lt;em&gt;&lt;strong&gt;first&lt;/strong&gt;&lt;/em&gt;&lt;strong&gt; thing &lt;/strong&gt;&lt;code&gt;.__getattribute__()&lt;/code&gt;&lt;strong&gt; checks is whether the attribute is a data descriptor.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Since &lt;code&gt;.first&lt;/code&gt; is a data descriptor, the search stops there. Python calls &lt;code&gt;DataDescriptor.__get__()&lt;/code&gt; and returns whatever the &lt;code&gt;.__get__()&lt;/code&gt; method returns. This is the string &lt;code&gt;&quot;This is the Data Descriptor&quot;&lt;/code&gt; in this demo example.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;2. &lt;/strong&gt;&lt;code&gt;test.second&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;Let&amp;#8217;s look at the next segment in the code&amp;#8217;s output. This is generated when you call &lt;code&gt;print(test.second)&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Printing `test.second`
Calling __getattribute__ with argument name='second'
'second': a data attribute defined in .__init__&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;You know the drill by now. The dot notation is the trigger that makes Python call &lt;code&gt;.__getattribute__()&lt;/code&gt;. It checks whether the attribute is a data descriptor. But &lt;code&gt;.second&lt;/code&gt; is not a data descriptor. You initially define it as a &lt;em&gt;non-data&lt;/em&gt; descriptor. Since it&amp;#8217;s definitely not a data descriptor, it&amp;#8217;s time to move on to the second check in the hierarchy.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The &lt;/strong&gt;&lt;em&gt;&lt;strong&gt;second&lt;/strong&gt;&lt;/em&gt;&lt;strong&gt; thing &lt;/strong&gt;&lt;code&gt;.__getattribute__()&lt;/code&gt;&lt;strong&gt; checks is whether the attribute is an instance attribute.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Is this attribute name in the object&amp;#8217;s &lt;code&gt;.__dict__&lt;/code&gt; dictionary? Instance attributes come second in the hierarchy. Now, you&amp;#8217;ve seen that &lt;code&gt;.second&lt;/code&gt; is not a data descriptor, which is why we moved to the second check. But is &lt;code&gt;test.second&lt;/code&gt; an instance attribute or a non-data descriptor?&lt;/p&gt;&lt;p&gt;You can see from the printout that Python never called &lt;code&gt;NonDataDescriptor.__get__()&lt;/code&gt; even though you originally defined &lt;code&gt;.second&lt;/code&gt; as a &lt;code&gt;NonDataDescriptor&lt;/code&gt; object. The swap occurred when you created the &lt;code&gt;TestingAttributeAccess&lt;/code&gt; instance. Python calls the class&amp;#8217;s &lt;code&gt;.__init__()&lt;/code&gt;, which contains this assignment line: &lt;code&gt;self.second = ...&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Since &lt;code&gt;.second&lt;/code&gt; is not a data descriptor (it&amp;#8217;s a non-data descriptor) and doesn&amp;#8217;t have a &lt;code&gt;.__set__()&lt;/code&gt;, the descriptor protocol doesn&amp;#8217;t kick in here to set the value of this attribute. 
This is different from &lt;code&gt;.first&lt;/code&gt;, which was a data descriptor and, therefore, its &lt;code&gt;.__set__()&lt;/code&gt; was responsible for setting the value.&lt;/p&gt;&lt;p&gt;Since the &lt;code&gt;.second&lt;/code&gt; non-data descriptor doesn&amp;#8217;t have a &lt;code&gt;.__set__()&lt;/code&gt;, Python does what it always does in these cases: it assigns the new object, the string &lt;code&gt;&quot;'second': a data attribute defined in .__init__&quot;&lt;/code&gt;, to &lt;code&gt;test.second&lt;/code&gt;. Therefore, &lt;code&gt;test.second&lt;/code&gt; is now an instance attribute containing the string rather than the original non-data descriptor.&lt;/p&gt;&lt;p&gt;Although &lt;code&gt;test.second&lt;/code&gt; would originally access the non-data descriptor, you created an instance attribute when you initialised the object. It&amp;#8217;s now a data attribute, which is an instance attribute.&lt;/p&gt;&lt;p&gt;Note that &lt;code&gt;TestingAttributeAccess.second&lt;/code&gt; is still there as a class attribute, and that&amp;#8217;s still the non-data descriptor. But &lt;code&gt;test.second&lt;/code&gt; now points to another object, the string.&lt;/p&gt;&lt;p&gt;Therefore, when later in the code you call &lt;code&gt;print(test.second)&lt;/code&gt;, Python calls &lt;code&gt;.__getattribute__()&lt;/code&gt; and starts looking through its hierarchy. It&amp;#8217;s not a data descriptor. It is an instance attribute. Therefore, it returns its value.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;3. &lt;/strong&gt;&lt;code&gt;test.third&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;The next block of output is the one generated when you call &lt;code&gt;print(test.third)&lt;/code&gt;. Recall that &lt;code&gt;.third&lt;/code&gt; is a &lt;code&gt;NonDataDescriptor&lt;/code&gt; object. So was &lt;code&gt;.second&lt;/code&gt; initially. 
However, unlike &lt;code&gt;.second&lt;/code&gt;, Python never creates an instance attribute that&amp;#8217;s assigned to &lt;code&gt;test.third&lt;/code&gt; since you don&amp;#8217;t assign to &lt;code&gt;self.third&lt;/code&gt; in &lt;code&gt;.__init__()&lt;/code&gt;. Here&amp;#8217;s the output:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Printing `test.third`
Calling __getattribute__ with argument name='third'
Calling NonDataDescriptor.__get__
This is the Non-Data Descriptor&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;In &lt;code&gt;print(test.third)&lt;/code&gt;, there&amp;#8217;s a dot again, so Python calls &lt;code&gt;.__getattribute__()&lt;/code&gt;. This method first checks whether &lt;code&gt;.third&lt;/code&gt; is a data descriptor. It is not. Then it checks whether &lt;code&gt;.third&lt;/code&gt; is an instance attribute. It is not. So...&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The &lt;/strong&gt;&lt;em&gt;&lt;strong&gt;third&lt;/strong&gt;&lt;/em&gt;&lt;strong&gt; thing &lt;/strong&gt;&lt;code&gt;.__getattribute__()&lt;/code&gt;&lt;strong&gt; checks is whether the attribute is a non-data descriptor.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Non-data descriptors come next in the hierarchy. That&amp;#8217;s why you can see the printout saying that Python calls &lt;code&gt;NonDataDescriptor.__get__&lt;/code&gt; and then prints the attribute&amp;#8217;s value, which is the value it gets from the non-data descriptor&amp;#8217;s &lt;code&gt;.__get__()&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Incidentally, ordinary instance methods are functions that also implement the &lt;code&gt;.__get__()&lt;/code&gt; special method. They are non-data descriptors. So, methods are found in this third check in the hierarchy of checks when you use the dot notation.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;4. &lt;/strong&gt;&lt;code&gt;test.fourth&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;It&amp;#8217;s &lt;code&gt;test.fourth&lt;/code&gt;&amp;#8217;s turn. Here&amp;#8217;s the output you get from &lt;code&gt;print(test.fourth)&lt;/code&gt;:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Printing `test.fourth`
Calling __getattribute__ with argument name='fourth'
I'm just a normal class attribute!&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;There&amp;#8217;s a dot, so there&amp;#8217;s a call to &lt;code&gt;.__getattribute__()&lt;/code&gt; which checks whether &lt;code&gt;.fourth&lt;/code&gt; is a data descriptor, an instance attribute, or a non-data descriptor, in that order. It&amp;#8217;s none of these. Time for the fourth check.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;The &lt;/strong&gt;&lt;em&gt;&lt;strong&gt;fourth&lt;/strong&gt;&lt;/em&gt;&lt;strong&gt; thing &lt;/strong&gt;&lt;code&gt;.__getattribute__()&lt;/code&gt;&lt;strong&gt; checks is whether the attribute is a class attribute.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And look at that! The &lt;code&gt;.fourth&lt;/code&gt; attribute is indeed a class attribute.&lt;/p&gt;&lt;h3&gt;&lt;strong&gt;5. &lt;/strong&gt;&lt;code&gt;test.fifth&lt;/code&gt;&lt;/h3&gt;&lt;p&gt;And finally, it&amp;#8217;s &lt;code&gt;.fifth&lt;/code&gt;&amp;#8217;s turn. You call &lt;code&gt;print(test.fifth)&lt;/code&gt; and you get the following:&lt;/p&gt;&lt;pre&gt;&lt;code&gt;Printing `test.fifth`
Calling __getattribute__ with argument name='fifth'
Calling __getattr__ with argument name='fifth'
This is the fallback value from __getattr__&lt;/code&gt;&lt;/pre&gt;&lt;p&gt;Yes, yes, there&amp;#8217;s a dot again, so Python calls &lt;code&gt;.__getattribute__()&lt;/code&gt;. No, it&amp;#8217;s not a data descriptor. No, it&amp;#8217;s not an instance attribute. No, it&amp;#8217;s not a non-data descriptor. No, it&amp;#8217;s not a class attribute. It&amp;#8217;s nothing. The attribute &lt;code&gt;.fifth&lt;/code&gt; doesn&amp;#8217;t exist.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;When &lt;/strong&gt;&lt;code&gt;.__getattribute__()&lt;/code&gt;&lt;strong&gt; fails to find the attribute, Python calls &lt;/strong&gt;&lt;code&gt;.__getattr__()&lt;/code&gt;&lt;strong&gt; if this special method exists.&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;And that&amp;#8217;s it. Simple, eh?!?!&lt;/p&gt;&lt;p&gt;Let&amp;#8217;s finish off by summarising the order in which Python deals with looking for attributes when you use the dot notation:&lt;/p&gt;&lt;ol&gt;&lt;li&gt;&lt;p&gt;The &lt;em&gt;first&lt;/em&gt; thing &lt;code&gt;.__getattribute__()&lt;/code&gt; checks is whether the attribute is a data descriptor.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The &lt;em&gt;second&lt;/em&gt; thing &lt;code&gt;.__getattribute__()&lt;/code&gt; checks is whether the attribute is an instance attribute.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The &lt;em&gt;third&lt;/em&gt; thing &lt;code&gt;.__getattribute__()&lt;/code&gt; checks is whether the attribute is a non-data descriptor.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;The &lt;em&gt;fourth&lt;/em&gt; thing &lt;code&gt;.__getattribute__()&lt;/code&gt; checks is whether the attribute is a class attribute.&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;Finally, when &lt;code&gt;.__getattribute__()&lt;/code&gt; fails to find the attribute, Python calls &lt;code&gt;.__getattr__()&lt;/code&gt; if this special method exists.&lt;/p&gt;&lt;/li&gt;&lt;/ol&gt;&lt;p&gt;The fifth step occurs when you access attributes using the dot notation. 
If you call &lt;code&gt;.__getattribute__()&lt;/code&gt; explicitly, which you rarely need to do, Python doesn&amp;#8217;t automatically look for &lt;code&gt;.__getattr__()&lt;/code&gt;.&lt;/p&gt;&lt;p&gt;Note that Python looks for descriptors and class attributes in the class and its base classes, too, following the method resolution order.&lt;/p&gt;&lt;h2&gt;&lt;strong&gt;Final Words&lt;/strong&gt;&lt;/h2&gt;&lt;p&gt;Wait for this. I&amp;#8217;ve been waiting all article to write this: Now, you won&amp;#8217;t &lt;em&gt;forget&lt;/em&gt; the &lt;em&gt;four &amp;#8220;get*&amp;#8221;&lt;/em&gt; special methods. Got it! (My son just walked out disapprovingly when he read this.)&lt;/p&gt;&lt;p&gt;You may never need to use all of these methods. Maybe you&amp;#8217;ll never need to use any of them. But if you ever wondered why there are these four special methods with similar names, now you know what they do. And you dug a bit more underneath the Python surface along the way. And that&amp;#8217;s always fun.&lt;/p&gt;&lt;div class=&quot;pullquote&quot;&gt;&lt;p&gt;&lt;em&gt;Your call&amp;#8230;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;The Python Coding Place offers something for everyone:&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;&amp;#8226; a super-personalised one-to-one 6-month mentoring option&lt;/em&gt;&lt;br /&gt;&lt;em&gt;$ 4,750&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;&amp;#8226; individual one-to-one sessions&lt;/em&gt;&lt;br /&gt;&lt;em&gt;$ 125&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;&amp;#8226; a self-led route with access to 60+ hrs of exceptional video courses and a support forum&lt;/em&gt;&lt;br /&gt;&lt;em&gt;$ 400&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://thepythoncodingplace.com?utm_source=the-python-coding-stack&quot;&gt;Which The Python Coding Place student are you?&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;&lt;/div&gt;&lt;div class=&quot;captioned-image-container&quot;&gt;&lt;a class=&quot;image-link image2 is-viewable-img&quot; 
target=&quot;_blank&quot; href=&quot;https://substackcdn.com/image/fetch/$s_!FUK2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708dc396-00c0-4ca1-85f4-3a00f69ca4f5_4936x3290.jpeg&quot;&gt;&lt;div class=&quot;image2-inset&quot;&gt;&lt;img src=&quot;https://substackcdn.com/image/fetch/$s_!FUK2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F708dc396-00c0-4ca1-85f4-3a00f69ca4f5_4936x3290.jpeg&quot; width=&quot;628&quot; height=&quot;418.3791208791209&quot; /&gt;&lt;/div&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;em&gt;&lt;a href=&quot;https://www.pexels.com/photo/an-abstract-painting-with-orange-and-blue-waves-21243683/&quot;&gt;Photo by Robert Clark&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;&lt;div&gt;&lt;hr /&gt;&lt;/div&gt;&lt;p&gt;&lt;em&gt;Code in this article uses Python 3.14&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;The code images used in this article are created using &lt;a href=&quot;https://snappify.cello.so/f4AsFrwgwov&quot;&gt;Snappify&lt;/a&gt;.&lt;/em&gt; &lt;em&gt;[Affiliate link]&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;&lt;strong&gt;&lt;a href=&quot;https://www.thepythoncodingstack.com/subscribe&quot;&gt;Join&lt;/a&gt;&lt;/strong&gt;&lt;/em&gt;&lt;strong&gt;&lt;a href=&quot;https://www.thepythoncodingstack.com/subscribe&quot;&gt; The Club&lt;/a&gt;&lt;/strong&gt;&lt;em&gt;, the exclusive area for paid subscribers for &lt;a 
href=&quot;https://www.thepythoncodingstack.com/s/the-club&quot;&gt;more Python posts&lt;/a&gt;, videos, a members&amp;#8217; forum, and more.&lt;/em&gt;&lt;/p&gt;&lt;p class=&quot;button-wrapper&quot;&gt;&lt;a class=&quot;button primary&quot; href=&quot;https://www.thepythoncodingstack.com/subscribe&quot;&gt;&lt;span&gt;Subscribe now&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;You can also support this publication by making a &lt;a href=&quot;https://buy.stripe.com/00g3de2iGdgg4gg7su&quot;&gt;one-off contribution of any amount you wish&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;&lt;p class=&quot;button-wrapper&quot;&gt;&lt;a class=&quot;button primary&quot; href=&quot;https://buy.stripe.com/00g3de2iGdgg4gg7su&quot;&gt;&lt;span&gt;Support The Python Coding Stack&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;div&gt;&lt;hr /&gt;&lt;/div&gt;&lt;p&gt;&lt;em&gt;For more Python resources, you can also visit&lt;/em&gt; &lt;em&gt;&lt;a href=&quot;https://realpython.com?utm_source=the-python-coding-stack&quot;&gt;Real Python&lt;/a&gt;&amp;#8212;you may even stumble on one of my own articles or courses there!&lt;/em&gt;&lt;/p&gt;&lt;p&gt;&lt;em&gt;Also, are you interested in technical writing? You&amp;#8217;d like to make your own writing more narrative, more engaging, more memorable? 
Have a look at&lt;/em&gt; &lt;em&gt;&lt;a href=&quot;http://stephengruppetta.com/breaking-the-rules&quot;&gt;Breaking the Rules&lt;/a&gt;&lt;/em&gt;.&lt;/p&gt;&lt;p&gt;&lt;em&gt;And you can find out more about me at&lt;/em&gt; &lt;em&gt;&lt;a href=&quot;https://stephengruppetta.com?utm_source=the-python-coding-stack&quot;&gt;stephengruppetta.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;&lt;p&gt;Further reading related to this article&amp;#8217;s topic:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.thepythoncodingstack.com/p/python-getitem-special-method-library-manorhouse&quot;&gt;The Manor House, the Oak-Panelled Library, the Vending Machine, and Python&amp;#8217;s &lt;/a&gt;&lt;code&gt;__getitem__()&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p&gt;&lt;a href=&quot;https://www.thepythoncodingstack.com/p/python-descriptors-the-price-is-right&quot;&gt;The Weird and Wonderful World of Descriptors in Python &amp;#8226; The Price is Right&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;hr /&gt;&lt;/div&gt;&lt;h2&gt;&lt;strong&gt;Appendix: Code Blocks&lt;/strong&gt;&lt;/h2&gt;&lt;h5&gt;&lt;strong&gt;Code Block #1&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

table = PointsTable(
    {
        &quot;Stephen&quot;: 20,
        &quot;Mark&quot;: 17,
        &quot;Kate&quot;: 19,
        &quot;Sarah&quot;: 22,
    }
)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #2&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table[&quot;Mark&quot;])&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #3&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        return self._data[item]

table = PointsTable(
    {
        &quot;Stephen&quot;: 20,
        &quot;Mark&quot;: 17,
        &quot;Kate&quot;: 19,
        &quot;Sarah&quot;: 22,
    }
)

print(table[&quot;Mark&quot;])&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #4&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table[&quot;Mark&quot;, &quot;Stephen&quot;])&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #5&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        print(f&quot;{item=}&quot;)
        print(f&quot;{type(item)=}&quot;)
        return self._data[item]

table = PointsTable(
    # ...
)

print(table[&quot;Mark&quot;, &quot;Stephen&quot;])&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #6&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        if isinstance(item, tuple):
            return tuple(self._data[key] for key in item)
        return self._data[item]

table = PointsTable(
    # ...
)

print(table[&quot;Mark&quot;, &quot;Stephen&quot;])
print(table[&quot;Sarah&quot;, &quot;Stephen&quot;, &quot;Kate&quot;])
print(table[&quot;Mark&quot;])&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #7&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table.Mark)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #8&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    # ...

    def __getitem__(self, item):
        print(f&quot;Calling __getitem__ with argument {item=}&quot;)
        if isinstance(item, tuple):
            return tuple(self._data[key] for key in item)
        return self._data[item]

    def __getattr__(self, name):
        print(f&quot;Calling __getattr__ with argument {name=}&quot;)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(
                f&quot;'{type(self).__name__}' object has &quot;
                f&quot;no attribute '{name}'&quot;
            ) from None

table = PointsTable(
    # ...
)

print(table[&quot;Mark&quot;])
print(table.Mark)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #9&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table.Matilda)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #10&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table._data)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #11&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table.Mark)
print(table._data)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #12&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        # ...

    def __getattr__(self, name):
        # ...

    def __getattribute__(self, name):
        print(f&quot;Calling __getattribute__ with argument {name=}&quot;)

table = PointsTable(
    # ...
)

print(table.Mark)
print(table._data)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #13&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table.Mark)
print(table._data)
print(table.Janine)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #14&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        # ...

    def __getattr__(self, name):
        # ...

    def __getattribute__(self, name):
        print(f&quot;Calling __getattribute__ with argument {name=}&quot;)
        return super().__getattribute__(name)

table = PointsTable(
    # ...
)

print(table._data)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #15&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;# ...
print(table.Mark)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #16&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        # ...

    # def __getattr__(self, name):
    #     print(f&quot;Calling __getattr__ with argument {name=}&quot;)
    #     try:
    #         return self._data[name]
    #     except KeyError:
    #         raise AttributeError(
    #             f&quot;'{type(self).__name__}' object has &quot;
    #             f&quot;no attribute '{name}'&quot;
    #         ) from None

    def __getattribute__(self, name):
        print(f&quot;Calling __getattribute__ with argument {name=}&quot;)
        return super().__getattribute__(name)

table = PointsTable(
    # ...
)

print(table.Mark)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #17&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)
        self.Mark = &quot;This is Mark as a data attribute&quot;

    def __getitem__(self, item):
        # ...

    # def __getattr__(self, name):
    # ...

    def __getattribute__(self, name):
        print(f&quot;Calling __getattribute__ with argument {name=}&quot;)
        return super().__getattribute__(name)

table = PointsTable(
    # ...
)

print(table.Mark)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #18&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    Mark = &quot;This is Mark as a class attribute&quot;

    def __init__(self, data):
        self._data = dict(data)
        # self.Mark = &quot;This is Mark as a data attribute&quot;

    def __getitem__(self, item):
        # ...

    # def __getattr__(self, name):
    # ...

    def __getattribute__(self, name):
        print(f&quot;Calling __getattribute__ with argument {name=}&quot;)
        return super().__getattribute__(name)

table = PointsTable(
    # ...
)

print(table.Mark)&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #19&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class PointsTable:
    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, item):
        print(f&quot;Calling __getitem__ with argument {item=}&quot;)
        if isinstance(item, tuple):
            return tuple(self._data[key] for key in item)
        return self._data[item]

    def __getattr__(self, name):
        print(f&quot;Calling __getattr__ with argument {name=}&quot;)
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(
                f&quot;'{type(self).__name__}' object has &quot;
                f&quot;no attribute '{name}'&quot;
            ) from None

    def __getattribute__(self, name):
        print(f&quot;Calling __getattribute__ with argument {name=}&quot;)
        return super().__getattribute__(name)

table = PointsTable(
    {
        &quot;Stephen&quot;: 20,
        &quot;Mark&quot;: 17,
        &quot;Kate&quot;: 19,
        &quot;Sarah&quot;: 22,
    }
)

print(table._data)  # Only .__getattribute__() used
print(table.Mark)  # Fallback .__getattr__() needed here&lt;/code&gt;&lt;/pre&gt;&lt;h5&gt;&lt;strong&gt;Code Block #20&lt;/strong&gt;&lt;/h5&gt;&lt;pre&gt;&lt;code&gt;class DataDescriptor:
    def __get__(self, instance, owner):
        print(&quot;Calling DataDescriptor.__get__&quot;)
        return &quot;This is the Data Descriptor&quot;

    def __set__(self, instance, value):
        print(f&quot;Calling DataDescriptor.__set__ with {value=}&quot;)

class NonDataDescriptor:
    def __get__(self, instance, owner):
        print(&quot;Calling NonDataDescriptor.__get__&quot;)
        return &quot;This is the Non-Data Descriptor&quot;

class TestingAttributeAccess:
    first = DataDescriptor()
    second = NonDataDescriptor()
    third = NonDataDescriptor()
    fourth = &quot;I&amp;#8217;m just a normal class attribute!&quot;

    def __init__(self):
        self.first = &quot;'first': a data attribute defined in .__init__&quot;
        self.second = &quot;'second': a data attribute defined in .__init__&quot;

    def __getattribute__(self, name):
        print(f&quot;Calling __getattribute__ with argument {name=}&quot;)
        return super().__getattribute__(name)

    def __getattr__(self, name):
        print(f&quot;Calling __getattr__ with argument {name=}&quot;)
        return &quot;This is the fallback value from __getattr__&quot;

test = TestingAttributeAccess()
print(&quot;\nPrinting `test.first`&quot;)
print(test.first)
print(&quot;\nPrinting `test.second`&quot;)
print(test.second)
print(&quot;\nPrinting `test.third`&quot;)
print(test.third)
print(&quot;\nPrinting `test.fourth`&quot;)
print(test.fourth)
print(&quot;\nPrinting `test.fifth`&quot;)
print(test.fifth)&lt;/code&gt;&lt;/pre&gt;</description>
	<pubDate>Mon, 04 May 2026 11:59:43 +0000</pubDate>
</item>
<item>
	<title>Daniel Roy Greenfeld: Word counter that ignores Markdown</title>
	<guid>https://daniel.feldroy.com/posts/2026-05-word-counter-that-ignores-markdown</guid>
	<link>https://daniel.feldroy.com/posts/2026-05-word-counter-that-ignores-markdown</link>
	<description>&lt;p&gt;I've been doing a lot of writing recently, and tracking my word count. I write in markdown. I could just render the text using a markdown library and then do a count on the generated output, but then I wouldn't have the fun of writing out a bunch of regular expressions. Yes, I know the cautionary meme that says:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;&quot;Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.&quot;
-- Jamie Zawinski&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;I don't care.&lt;/p&gt;
&lt;p&gt;I love working with regular expressions. They were the one thing I got out of my brief foray into Perl at the very start of my software development career. I carried them into my Java and ColdFusion days and periodically use them in Python. Yes, Python has lots of useful string tools, but playing with regular expressions until they are just right remains a fun puzzle for me.&lt;/p&gt;
&lt;p&gt;So here you go, a Python word counter powered by my desire to noodle with regular expressions:&lt;/p&gt;
&lt;div class=&quot;codehilite&quot;&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;word_count.py — Count words in a Markdown file or a directory of markdown files.&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;Dependencies:&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    typer&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    rich&lt;/span&gt;

&lt;span class=&quot;sd&quot;&gt;Usage:&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    python word_count.py README.md&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    python word_count.py README.md --no-strip-markdown&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    python word_count.py README.md --verbose&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;    python word_count.py book/&lt;/span&gt;
&lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;re&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;pathlib&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;

&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;typer&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;rich.console&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Console&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;rich.panel&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Panel&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;rich.table&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;
&lt;span class=&quot;kn&quot;&gt;from&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nn&quot;&gt;rich&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;kn&quot;&gt;import&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;box&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;app&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Typer&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;word-count&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Count words in Markdown files.&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;add_completion&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;console&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Console&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;MARKDOWN_PATTERNS&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;```[\s\S]*?```&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# fenced code blocks&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;`[^`]+`&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# inline code&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;!\[.*?\]\(.*?\)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# images&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;\[.*?\]\(.*?\)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# links =&amp;gt; keep link text&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^#{1,6}\s+&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# ATX headings&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^\s*[-*+]\s+&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# unordered list markers&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^\s*\d+\.\s+&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# ordered list markers&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[*_]{1,2}([^*_]+)[*_]{1,2}&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# bold / italic =&amp;gt; keep inner text&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;~~([^~]+)~~&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# strikethrough =&amp;gt; keep inner text&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^&amp;gt;+\s*&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# blockquote markers&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^\s*\|.*\|\s*$&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# table rows (kept as-is, words counted)&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^[-*_]{3,}\s*$&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# horizontal rules&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;lt;!--[\s\S]*?--&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# HTML comments&lt;/span&gt;
    &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;lt;[^&amp;gt;]+&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# remaining HTML tags&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;_STRIP_RE&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;compile&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;|&amp;quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;join&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MARKDOWN_PATTERNS&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MULTILINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;strip_markdown&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;Remove Markdown syntax, keeping readable prose.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Remove images entirely; replace links with their label text&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;!\[.*?\]\(.*?\)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;\[(.*?)\]\(.*?\)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;\1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Remove fenced code blocks entirely&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;```[\s\S]*?```&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Remove inline code&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;`[^`]+`&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Unwrap bold / italic&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[*_]{1,2}([^*_\n]+)[*_]{1,2}&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;\1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;~~([^~]+)~~&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;\1&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Remove HTML comments and tags&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;lt;!--[\s\S]*?--&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;lt;[^&amp;gt;]+&amp;gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c1&quot;&gt;# Strip heading, list, and blockquote markers, then horizontal rules&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^#{1,6}\s+&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MULTILINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^\s*[-*+]\s+&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MULTILINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^\s*\d+\.\s+&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MULTILINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^&amp;gt;+\s*&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MULTILINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;^[-*_]{3,}\s*$&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;flags&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;MULTILINE&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;count_stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;str&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;Return word, line, character, and sentence counts, average word length, and estimated reading time.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;split&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;lines&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;splitlines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;chars_no_space&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sub&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;\s&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sentences&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;re&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;findall&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[.!?]+&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;words&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;lines&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lines&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;chars&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;chars_no_space&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;chars_no_space&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;sentences&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sentences&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;avg_word_len&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;nb&quot;&gt;round&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;words&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;reading_time_min&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;round&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;words&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)),&lt;/span&gt;  &lt;span class=&quot;c1&quot;&gt;# assumes ~200 words per minute, never below 1&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;_count_single_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;dict&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;Count words for a single file, print output, and return stats.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;raw&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;read_text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;encoding&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;utf-8&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;text&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strip_markdown&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;raw&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;raw&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;count_stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;text&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;plain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;echo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\t&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'words'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Panel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
                &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[bold cyan]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'words'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;[/bold cyan] words  ·  &amp;quot;&lt;/span&gt;
                &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[dim]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'reading_time_min'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; min read[/dim]&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[bold]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;name&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;[/bold]&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;border_style&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;cyan&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Verbose: full table&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;box&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;box&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ROUNDED&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;show_header&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;header_style&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;bold magenta&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Metric&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;style&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;bold&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_column&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Value&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;justify&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;right&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Words&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'words'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Lines&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'lines'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Characters (with spaces)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'chars'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Characters (no spaces)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'chars_no_space'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Sentences (approx.)&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'sentences'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Average word length&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'avg_word_len'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; chars&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Estimated reading time&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s1&quot;&gt;'reading_time_min'&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; min&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Markdown stripped&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;yes&amp;quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;no&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;add_row&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;label&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;  [bold]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;file&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;[/bold]&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;style&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;dim&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;table&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;


&lt;span class=&quot;nd&quot;&gt;@app&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;command&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;def&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nf&quot;&gt;count&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Path&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Argument&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Path to a Markdown file or a directory with digit-prefixed .md files.&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;exists&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;file_okay&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dir_okay&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;readable&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;kc&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;kc&quot;&gt;True&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;--strip-markdown/--no-strip-markdown&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Strip Markdown syntax before counting (default: True).&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;kc&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;--verbose&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;-v&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Show a full breakdown table.&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;plain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;bool&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Option&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
        &lt;span class=&quot;kc&quot;&gt;False&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;s2&quot;&gt;&amp;quot;--plain&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;help&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;Print a bare number (word count only) — useful for scripting.&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;):&lt;/span&gt;
&lt;span class=&quot;w&quot;&gt;    &lt;/span&gt;&lt;span class=&quot;sd&quot;&gt;&amp;quot;&amp;quot;&amp;quot;Count words in a Markdown FILE or all digit-prefixed .md files in a directory.&amp;quot;&amp;quot;&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;():&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;_count_single_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;plain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt;

    &lt;span class=&quot;c1&quot;&gt;# Directory mode: find .md files starting with a digit&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;files&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;sorted&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;glob&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[0-9]*.md&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;is_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;())&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;not&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[red]No digit-prefixed .md files found in &lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;[/red]&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;raise&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Exit&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;code&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;total_words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;ow&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_count_single_file&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strip&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;verbose&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;plain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;total_words&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;stats&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;words&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;plain&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;typer&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;echo&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;TOTAL&lt;/span&gt;&lt;span class=&quot;se&quot;&gt;\t&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_words&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;console&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;print&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;Panel&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;
                &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[bold green]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_words&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;[/bold green] words across &amp;quot;&lt;/span&gt;
                &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[bold]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;files&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;[/bold] files  ·  &amp;quot;&lt;/span&gt;
                &lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[dim]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;round&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;total_words&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;w&quot;&gt; &lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;200&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;))&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt; min read[/dim]&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;title&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;sa&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;[bold]&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;path&lt;/span&gt;&lt;span class=&quot;si&quot;&gt;}&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;[/bold] — Total&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;border_style&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;green&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
            &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;


&lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;vm&quot;&gt;__name__&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;__main__&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;app&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;</description>
	<pubDate>Mon, 04 May 2026 11:55:23 +0000</pubDate>
</item>
<item>
	<title>PyCharm: PyTorch vs. TensorFlow: Choosing the Right Framework in 2026</title>
	<guid>https://blog.jetbrains.com/pycharm/2026/05/pytorch-vs-tensorflow-choosing-framework-2026/</guid>
	<link>https://blog.jetbrains.com/pycharm/2026/05/pytorch-vs-tensorflow-choosing-framework-2026/</link>
	<description>&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/05/PC-social-BlogFeatured-1280x720-1.png&quot; alt=&quot;PyTorch vs. TensorFlow&quot; class=&quot;wp-image-704800&quot; /&gt;



&lt;p&gt;Choosing between &lt;a href=&quot;https://pytorch.org/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;PyTorch&lt;/a&gt; and &lt;a href=&quot;https://www.tensorflow.org/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;TensorFlow&lt;/a&gt; isn’t about finding the &amp;#8220;better&amp;#8221; framework – it’s about finding the right fit for your project. Both power cutting-edge AI systems, but they excel in different domains. PyTorch dominates research and experimentation, while TensorFlow leads in production deployment at scale.&lt;/p&gt;



&lt;p&gt;The frameworks have evolved significantly since their early days, each building tools and capabilities to support research and production. Despite these improvements, fundamental differences remain in their philosophies, ecosystems, and ideal use cases, and these differences influence which framework best fits your project.&lt;/p&gt;



&lt;p&gt;This guide examines where each framework shines, compares them across key dimensions, and helps you choose the right tool for your natural language processing, computer vision, and reinforcement learning projects.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;What sets PyTorch and TensorFlow apart?&lt;/strong&gt;&lt;/h2&gt;



&lt;p&gt;PyTorch and TensorFlow took different approaches from day one. Google launched TensorFlow in 2015, focusing on production deployment and enterprise scalability. Meta (then Facebook) released PyTorch in 2016, prioritizing research flexibility and Pythonic development. These roots still shape each framework today.&lt;/p&gt;



&lt;p&gt;The key difference between the two lies in computational graphs. PyTorch uses dynamic graphs that execute operations immediately, making debugging natural – you use standard Python tools and inspect tensors at any point. TensorFlow originally required static graphs defined before execution, though version 2.x now defaults to eager execution while retaining optional graph compilation for performance.&lt;/p&gt;
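
&lt;p&gt;To make the dynamic-graph idea concrete, here is a minimal pure-Python sketch of define-by-run execution. It is a teaching toy, not PyTorch’s implementation: the &lt;code&gt;Value&lt;/code&gt; class and the &lt;code&gt;forward&lt;/code&gt; function are hypothetical names invented for this illustration. The graph is recorded while ordinary Python code runs, so an &lt;code&gt;if&lt;/code&gt; statement can change its shape on every call and intermediate values are plain floats you can inspect.&lt;/p&gt;

```python
# Toy define-by-run autograd: a teaching sketch, not PyTorch's implementation.
class Value:
    """A scalar that records how it was computed while the code runs."""

    def __init__(self, data, parents=(), backward_fn=None):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.backward_fn = backward_fn  # pushes self.grad to the parents

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def backward_fn():
            self.grad += out.grad
            other.grad += out.grad

        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def backward_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Topologically order the recorded graph, then apply the chain rule.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            if node.backward_fn is not None:
                node.backward_fn()


def forward(x, w, b):
    h = x * w
    # Ordinary Python control flow decides the graph shape on every call,
    # and h.data is a plain float you can print or inspect in a debugger.
    if h.data > 0:
        return h + b
    return h * b


x, w, b = Value(2.0), Value(3.0), Value(1.0)
y = forward(x, w, b)
y.backward()
print(y.data, x.grad, w.grad, b.grad)
```

&lt;p&gt;A static-graph system would need the branch expressed as a graph operation before any data flows through it. In the sketch, the branch is just Python, which is the property that makes breakpoints and print statements work as expected in an eager framework.&lt;/p&gt;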



&lt;p&gt;&lt;a href=&quot;https://6sense.com/tech/data-science-machine-learning/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;Market data shows&lt;/a&gt; TensorFlow holds a 37% market share, while PyTorch commands 25%. But the research tells a different story: &lt;a href=&quot;https://leapcell.io/blog/tensorflow-vs-pytorch-a-comparative-analysis-for-2025&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;PyTorch powers 85% of deep learning papers&lt;/a&gt; presented at top AI conferences.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;PyTorch: Strengths and weaknesses&lt;/strong&gt;&lt;/h3&gt;



&lt;p&gt;PyTorch’s Pythonic API treats models like regular Python code, making development feel intuitive from the start. The framework’s dynamic computational graphs execute operations immediately rather than requiring upfront model definition, fundamentally changing how you approach debugging and experimentation.&lt;/p&gt;



&lt;p&gt;This design philosophy has made PyTorch the dominant choice in research, where flexibility matters more than deployment infrastructure. However, this research-first design means production deployment tools remain less mature than TensorFlow’s enterprise infrastructure.&lt;/p&gt;



&lt;h4 class=&quot;wp-block-heading&quot;&gt;PyTorch strengths&lt;/h4&gt;



&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Intuitive, Pythonic API:&lt;/strong&gt; Models use standard Python syntax with minimal framework-specific concepts, reducing the learning curve dramatically compared to other frameworks.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Dynamic graphs enable natural debugging:&lt;/strong&gt; Set breakpoints in training loops, inspect tensor values mid-execution, and modify architectures on the fly using tools you already know.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Priority access to the latest techniques:&lt;/strong&gt; Because of its research dominance, when cutting-edge architectures or methods emerge, they’re usually implemented in PyTorch first.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Strong ecosystem:&lt;/strong&gt; Libraries like PyTorch Lightning handle training loops and best practices automatically, letting you focus on model architecture.&lt;/li&gt;
&lt;/ul&gt;



&lt;h4 class=&quot;wp-block-heading&quot;&gt;PyTorch weaknesses&lt;/h4&gt;



&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Production deployment tools are less mature:&lt;/strong&gt; Deployment options lag behind TensorFlow’s battle-tested infrastructure, so you need to do more setup work for production systems.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Mobile and edge deployment is limited:&lt;/strong&gt; PyTorch Mobile is functional but less polished than TensorFlow Lite for smartphones and IoT devices.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Dynamic nature complicates optimization:&lt;/strong&gt; The flexibility that aids development can make optimization for production performance harder without additional tools like TorchScript.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Smaller enterprise adoption:&lt;/strong&gt; Fewer production patterns and case studies compared to TensorFlow’s extensive enterprise documentation.&lt;/li&gt;
&lt;/ul&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;TensorFlow: Strengths and weaknesses&lt;/strong&gt;&lt;/h3&gt;



&lt;p&gt;TensorFlow’s production ecosystem provides you with a comprehensive infrastructure for deploying models at scale. Google built the framework specifically for enterprise environments where reliability, performance, and deployment flexibility matter most.&lt;/p&gt;



&lt;p&gt;This production-first approach created mature tooling for serving, mobile optimization, and MLOps that PyTorch is still catching up to. The trade-off comes in development experience – TensorFlow’s API can feel more complex and less intuitive than PyTorch’s streamlined approach.&lt;/p&gt;



&lt;h4 class=&quot;wp-block-heading&quot;&gt;TensorFlow strengths&lt;/h4&gt;



&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Mature production deployment tools:&lt;/strong&gt; Battle-tested infrastructure with &lt;a href=&quot;https://www.tensorflow.org/tfx/guide/serving&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;TensorFlow Serving&lt;/a&gt; for high-throughput serving, &lt;a href=&quot;https://www.tensorflow.org/lite&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;TensorFlow Lite&lt;/a&gt; for mobile, and TensorFlow.js for browsers.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Superior mobile and edge optimization:&lt;/strong&gt; TensorFlow Lite delivers industry-standard performance and comprehensive device support for smartphones and edge devices.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Strong enterprise adoption:&lt;/strong&gt; Proven production patterns used by thousands of companies, with extensive documentation for scaling systems serving millions of predictions.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Comprehensive MLOps tooling:&lt;/strong&gt; TensorFlow Extended (TFX) gives you end-to-end pipelines for production ML workflows, from data validation through model monitoring.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;TPU support for large-scale training:&lt;/strong&gt; Access to Google’s specialized Tensor Processing Units for training at massive scale with performance advantages over GPU infrastructure.&lt;/li&gt;
&lt;/ul&gt;



&lt;h4 class=&quot;wp-block-heading&quot;&gt;TensorFlow weaknesses&lt;/h4&gt;



&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Steeper learning curve:&lt;/strong&gt; More complexity when implementing custom models or debugging issues, even with Keras integration simplifying high-level operations.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;More verbose code for custom work:&lt;/strong&gt; Novel architectures or training procedures require significantly more code compared to PyTorch’s streamlined approach.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Larger, less cohesive API:&lt;/strong&gt; Broader API surface with multiple ways to accomplish the same task creates confusion and longer learning curves.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Debugging can be challenging:&lt;/strong&gt; Graph-related issues may require you to understand TensorFlow’s internal execution model despite eager execution improvements.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Slower adoption of research techniques:&lt;/strong&gt; New methods from research papers typically take longer to appear in TensorFlow compared to PyTorch.&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;If you’re new to TensorFlow and want a hands-on starting point, check out &lt;a href=&quot;https://blog.jetbrains.com/pycharm/2026/04/how-to-train-your-first-tensorflow-model/&quot;&gt;How to Train Your First TensorFlow Model in PyCharm&lt;/a&gt;, where you’ll build and train a simple model step by step using Keras and visualize the results.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;PyTorch vs. TensorFlow: Head-to-head comparison&lt;/strong&gt;&lt;/h2&gt;



&lt;p&gt;Choosing between PyTorch and TensorFlow isn’t always straightforward, and there are many factors to consider.&lt;/p&gt;



&lt;p&gt;The table below provides a high-level head-to-head comparison of PyTorch and TensorFlow so you can quickly assess which framework generally fits your needs. We’ll later consider project-specific scenarios and provide a detailed decision matrix to guide your choice.&lt;/p&gt;



&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Dimension&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;PyTorch&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;TensorFlow&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Learning curve&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Easier: Pythonic and intuitive&lt;/td&gt;&lt;td&gt;Steeper: more complex API despite Keras&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Debugging&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Excellent: standard Python tools work naturally&lt;/td&gt;&lt;td&gt;Good: improved with eager execution&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Production deployment&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Improving: TorchServe and TorchScript available&lt;/td&gt;&lt;td&gt;Excellent: mature ecosystem (Serving, Lite, JS)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Research/experimentation&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Dominant: 85% of deep‑learning research papers&lt;/td&gt;&lt;td&gt;Present: but trailing PyTorch in adoption&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Community ecosystem&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Research-focused: Hugging Face, PyTorch Lightning&lt;/td&gt;&lt;td&gt;Enterprise-focused: TFX, strong cloud integration&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Performance at scale&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Strong: DDP for distributed training&lt;/td&gt;&lt;td&gt;Strong: graph optimization, TPU support&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Industry adoption&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Growing: used by 15,800+ companies&lt;/td&gt;&lt;td&gt;Established: used by more than 23,000 companies&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;PyTorch vs. TensorFlow for different use cases and applications&lt;/strong&gt;&lt;/h2&gt;



&lt;p&gt;Your framework choice depends heavily on what you’re building. Here’s how PyTorch and TensorFlow stack up for major &lt;a href=&quot;https://blog.jetbrains.com/pycharm/2022/06/start-studying-machine-learning-with-pycharm/&quot;&gt;machine learning&lt;/a&gt; domains.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;Natural language processing&lt;/strong&gt;&lt;/h3&gt;



&lt;p&gt;PyTorch dominates NLP with no signs of slowing. The &lt;a href=&quot;https://huggingface.co/docs/transformers&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;Hugging Face Transformers&lt;/a&gt; library – the de facto standard for working with language models – started as a PyTorch-only framework and later added TensorFlow support as a secondary option. When you’re fine-tuning transformers, implementing custom attention mechanisms, or experimenting with novel architectures, PyTorch’s flexibility accelerates your iteration.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; PyTorch leads NLP decisively. Choose TensorFlow only if you have specific mobile deployment requirements that override all other considerations.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;Computer vision&lt;/strong&gt;&lt;/h3&gt;



&lt;p&gt;Computer vision presents a more balanced landscape for your projects. PyTorch benefits from research momentum – when you’re developing novel detection algorithms or experimenting with architectures, you’ll find state-of-the-art implementations appear in PyTorch first. TensorFlow excels for building production CV systems, especially for mobile object detection or on-device image classification, where TensorFlow Lite’s optimization matters most.&lt;/p&gt;



&lt;p&gt;For a hands-on example, watch the video on how to build a TensorFlow object detection app, which shows how to take a pre-trained model and turn it into a real-time object detection app running on a robot in PyCharm.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; Use case dependent. Choose PyTorch for research and novel architectures, TensorFlow when your deployment priorities favor mobile and edge devices.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;Reinforcement learning&lt;/strong&gt;&lt;/h3&gt;



&lt;p&gt;PyTorch holds a slight edge in reinforcement learning, driven by the research community’s preference for it. When you’re implementing custom RL algorithms, modifying reward functions dynamically, or debugging agent behavior, PyTorch’s flexibility serves you better. TensorFlow offers solid capabilities through TF-Agents for production RL systems at scale.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Verdict:&lt;/strong&gt; Choose PyTorch for RL research and experimentation or TensorFlow for building large-scale production-grade RL systems like recommendation engines.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;Tooling and developer experience in PyCharm&lt;/strong&gt;&lt;/h2&gt;



&lt;p&gt;PyCharm provides comprehensive support for both frameworks, streamlining your development workflow regardless of which you choose.&lt;/p&gt;



&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://www.jetbrains.com/help/pycharm/debugging-code.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;&lt;strong&gt;Debugging&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;:&lt;/strong&gt; Set breakpoints in training loops, inspect tensor values, and step through model forward passes using the integrated debugger that works naturally with PyTorch’s dynamic graphs and TensorFlow’s eager execution.&lt;/li&gt;



&lt;li&gt;&lt;a href=&quot;https://www.jetbrains.com/help/pycharm/jupyter-notebook-support.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;&lt;strong&gt;Jupyter notebook support&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;:&lt;/strong&gt; Prototype in notebooks, inspect data transformations visually, then move to scripts for production training with seamless integration.&lt;/li&gt;



&lt;li&gt;&lt;a href=&quot;https://www.jetbrains.com/help/pycharm/installing-uninstalling-and-upgrading-packages.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;&lt;strong&gt;Package management&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;:&lt;/strong&gt; Handle complex dependency trees and CUDA requirements using virtual environment management to prevent conflicts between frameworks.&lt;/li&gt;



&lt;li&gt;&lt;a href=&quot;https://www.jetbrains.com/help/pycharm/configuring-remote-interpreters-via-docker.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;&lt;strong&gt;Remote interpreters&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;:&lt;/strong&gt; Connect to remote GPU servers, develop locally while training remotely, and sync code automatically to take advantage of powerful hardware without leaving your IDE.&lt;/li&gt;



&lt;li&gt;&lt;a href=&quot;https://www.jetbrains.com/help/pycharm/tensorboard-support.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;&lt;strong&gt;TensorBoard integration&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;:&lt;/strong&gt; Track training metrics, visualize model graphs, and compare experiments within PyCharm using native TensorFlow support or torch.utils.tensorboard for PyTorch.&lt;/li&gt;



&lt;li&gt;&lt;a href=&quot;https://www.jetbrains.com/help/pycharm/auto-completing-code.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;&lt;strong&gt;Code completion&lt;/strong&gt;&lt;/a&gt;&lt;strong&gt;:&lt;/strong&gt; Get framework-specific suggestions for layer definitions, optimizer configurations, and data pipeline operations that reduce errors and accelerate development.&lt;/li&gt;
&lt;/ul&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;Performance, scalability, and deployment&lt;/strong&gt;&lt;/h2&gt;



&lt;p&gt;Training performance barely differs between frameworks for most workloads – both handle GPU training efficiently with comparable speeds. TensorFlow gains an edge when you need TPU support for large-scale training, offering more mature integration with Google’s specialized hardware. For multi-GPU scaling, both deliver strong performance with PyTorch’s DDP and TensorFlow’s &lt;a href=&quot;https://www.tensorflow.org/api_docs/python/tf/distribute/MirroredStrategy&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;MirroredStrategy&lt;/a&gt;.&lt;/p&gt;



&lt;p&gt;Deployment scenarios differentiate the frameworks more clearly. TensorFlow Serving handles production model serving at scale with built-in versioning and A/B testing that PyTorch’s &lt;a href=&quot;https://docs.pytorch.org/serve/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;TorchServe&lt;/a&gt; can’t yet match in maturity. When deploying to mobile devices or edge hardware, TensorFlow Lite provides industry-standard optimization through quantization and pruning. For browser deployment, TensorFlow.js offers more integrated, optimized inference compared to serving PyTorch models via ONNX Runtime.&lt;/p&gt;



&lt;p&gt;Memory management affects development experience – PyTorch’s caching allocator handles GPU memory efficiently with dynamic batch sizes, causing fewer surprises when experimenting with different model configurations.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;Community, ecosystem, and library support&lt;/strong&gt;&lt;/h2&gt;



&lt;p&gt;PyTorch’s research dominance created a vibrant, innovation-focused community that accelerates development. The &lt;a href=&quot;https://pytorch.org/blog/2024-year-in-review/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;PyTorch Conference 2024 saw triple the registrations&lt;/a&gt; versus 2023, and when cutting-edge techniques emerge, they appear in PyTorch first. The &lt;a href=&quot;https://huggingface.co/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;Hugging Face&lt;/a&gt; ecosystem amplifies this advantage – more than 220,000 PyTorch-compatible models versus around 15,000 for TensorFlow makes a tangible difference in development speed.&lt;/p&gt;



&lt;p&gt;TensorFlow’s community skews toward production engineering, providing comprehensive enterprise-grade documentation and proven deployment patterns. Google’s backing ensures strong cloud platform integrations, particularly with Google Cloud, offering managed services that reduce operational complexity. The &lt;a href=&quot;https://www.tensorflow.org/guide/model_garden&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;Model Garden&lt;/a&gt; provides production-ready implementations optimized for deployment rather than research experimentation.&lt;/p&gt;



&lt;p&gt;Learning resources reflect these different audiences – PyTorch tutorials emphasize research workflows and novel implementations, while TensorFlow documentation prioritizes production deployment patterns and enterprise-scale systems.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;&lt;strong&gt;Choosing the right framework for your project&lt;/strong&gt;&lt;/h2&gt;



&lt;p&gt;Many successful teams use both frameworks strategically – researching and experimenting in PyTorch, then deploying in TensorFlow. The frameworks aren’t mutually exclusive. You can use &lt;a href=&quot;https://onnx.ai/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;ONNX&lt;/a&gt; to enable model conversion between them when needed.&lt;/p&gt;



&lt;p&gt;When making a choice, it helps to prioritize the factors most relevant to your project: mobile deployment requirements may override other considerations, research-heavy work might make PyTorch essential, and enterprise support with MLOps integration could tip the scales toward TensorFlow.&lt;/p&gt;



&lt;p&gt;Use the table below to match your project requirements with each framework&amp;#8217;s strengths.&lt;/p&gt;



&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Decision Factor&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;PyTorch&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;TensorFlow&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;By use case&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Natural language processing&lt;/td&gt;&lt;td&gt;✅ NLP standard choice&lt;/td&gt;&lt;td&gt;Only if mobile deployment is critical&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Computer vision&lt;/td&gt;&lt;td&gt;✅ Research/novel architectures&lt;/td&gt;&lt;td&gt;✅ Production mobile/edge apps&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Reinforcement learning&lt;/td&gt;&lt;td&gt;✅ Research and experimentation&lt;/td&gt;&lt;td&gt;✅ Large-scale production RL&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;By experience level&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Beginner&lt;/td&gt;&lt;td&gt;✅ More intuitive API&lt;/td&gt;&lt;td&gt;Keras simplifies learning&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Intermediate/Advanced&lt;/td&gt;&lt;td&gt;✅ Research and prototyping&lt;/td&gt;&lt;td&gt;✅ Production systems at scale&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;By project phase&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Research/Experimentation&lt;/td&gt;&lt;td&gt;✅ Dynamic graphs aid iteration&lt;/td&gt;&lt;td&gt;Graph compilation for optimization&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Rapid prototyping&lt;/td&gt;&lt;td&gt;✅ Fast experimentation&lt;/td&gt;&lt;td&gt;Keras for simple models&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Production deployment&lt;/td&gt;&lt;td&gt;TorchServe improving&lt;/td&gt;&lt;td&gt;✅ Mature deployment tools&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;By deployment 
target&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Cloud/Server&lt;/td&gt;&lt;td&gt;Strong performance&lt;/td&gt;&lt;td&gt;✅ Strong performance, slight GCP advantage&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Mobile/Edge devices&lt;/td&gt;&lt;td&gt;Basic support via PyTorch Mobile&lt;/td&gt;&lt;td&gt;✅ TensorFlow Lite industry standard&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Web Applications&lt;/td&gt;&lt;td&gt;Via ONNX Runtime&lt;/td&gt;&lt;td&gt;✅ TensorFlow.js optimized&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;By team context&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Research-focused team&lt;/td&gt;&lt;td&gt;✅ Natural fit for researchers&lt;/td&gt;&lt;td&gt;If already using TensorFlow&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Production-focused team&lt;/td&gt;&lt;td&gt;If comfortable with tooling&lt;/td&gt;&lt;td&gt;✅ Proven enterprise patterns&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;</description>
	<pubDate>Mon, 04 May 2026 10:07:20 +0000</pubDate>
</item>
<item>
	<title>Python Engineering at Microsoft: Introducing Apache Arrow Support in mssql-python</title>
	<guid>https://devblogs.microsoft.com/python/introducing-apache-arrow-support-in-mssql-python/</guid>
	<link>https://devblogs.microsoft.com/python/introducing-apache-arrow-support-in-mssql-python/</link>
	<description>&lt;p&gt;&lt;span&gt;&lt;!--ScriptorStartFragment--&gt;&lt;/span&gt;&lt;/p&gt;
&lt;div&gt;
&lt;p&gt;&lt;span&gt;&lt;a href=&quot;https://devblogs.microsoft.com/azure-sql/wp-content/uploads/sites/56/2025/07/c1014e61-a66d-4807-ab58-655671044f49.png&quot;&gt;&lt;img class=&quot;alignnone wp-image-5513 size-large&quot; src=&quot;https://devblogs.microsoft.com/azure-sql/wp-content/uploads/sites/56/2025/07/c1014e61-a66d-4807-ab58-655671044f49-1024x519.png&quot; alt=&quot;c1014e61 a66d 4807 ab58 655671044f49 image&quot; width=&quot;1024&quot; height=&quot;519&quot; /&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;span&gt;Reviewed by Sumit Sarabhai&lt;/span&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Fetching a million rows from SQL Server into a &lt;a href=&quot;https://pola.rs/&quot;&gt;Polars&lt;/a&gt; DataFrame used to mean a million Python objects, a million GC allocations, and then throwing it all away to build a DataFrame. Not anymore. &lt;a href=&quot;https://github.com/microsoft/mssql-python&quot;&gt;mssql-python&lt;/a&gt; now supports fetching SQL Server data directly as Apache Arrow structures &amp;#8211; a faster and more memory-efficient path for anyone working with SQL Server data in Polars, Pandas, DuckDB, or any other Arrow-native library. This feature was contributed by community developer &lt;strong&gt;Felix Graßl (@ffelixg)&lt;/strong&gt;, and we are thrilled to ship it.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;div class=&quot;alert alert-primary&quot;&gt;&lt;p class=&quot;alert-divider&quot;&gt;&lt;i class=&quot;fabric-icon fabric-icon--Info&quot;&gt;&lt;/i&gt;&lt;strong&gt;Key Terms&lt;/strong&gt;&lt;/p&gt;&lt;strong&gt;API (Application Programming Interface):&lt;/strong&gt; a source-code contract that defines how to call a function or library.&lt;/div&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;ABI (Application Binary Interface):&lt;/strong&gt; a binary-level contract that specifies how compiled code is laid out in memory. Two programs built in different languages can share an ABI and exchange data directly &amp;#8211; no serialization is needed.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Arrow C Data Interface:&lt;/strong&gt; Apache Arrow&amp;#8217;s ABI specification &amp;#8211; the standard that makes zero-copy data exchange between languages possible.&lt;/span&gt;&lt;/p&gt;&lt;/div&gt;
&lt;h2&gt;&lt;span&gt;What Is Apache Arrow?&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span&gt;The key insight behind &lt;a href=&quot;https://arrow.apache.org/&quot;&gt;Apache Arrow&lt;/a&gt; is &lt;strong&gt;zero-copy language interoperability&lt;/strong&gt;. Arrow defines a stable shared-memory layout &amp;#8211; the &lt;strong&gt;Arrow C Data Interface&lt;/strong&gt;, a cross-language ABI &amp;#8211; that any language can produce or consume by exchanging a pointer, with no serialization, no copies, and no re-parsing. A C++ database driver and a Python DataFrame library can work on the exact same memory without either one knowing about the other.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Built on top of that, Arrow uses a &lt;strong&gt;columnar in-memory format&lt;/strong&gt;: instead of representing a table as a list of rows, each row a collection of Python objects, Arrow stores all values for a column contiguously in a typed buffer. Nulls are tracked in a compact bitmap rather than per-cell &lt;code&gt;None&lt;/code&gt; objects.&lt;/span&gt;&lt;/p&gt;
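&lt;p&gt;&lt;span&gt;As a rough illustration of the columnar idea, the sketch below packs a small row-oriented result into one typed buffer per column plus a validity bitmap, using only the Python standard library. It is not mssql-python or pyarrow code: &lt;code&gt;to_column&lt;/code&gt;, &lt;code&gt;is_valid&lt;/code&gt;, and the sample rows are hypothetical names chosen for this example, and the bitmap follows the least-significant-bit-first convention of the Arrow columnar format.&lt;/span&gt;&lt;/p&gt;

```python
from array import array

# Row-oriented input: each row is a tuple of Python objects (None marks NULL).
rows = [(1, 10.5), (2, None), (3, 7.25), (4, None), (5, 1.0)]


def to_column(values, typecode):
    """Pack one column into a typed buffer plus a validity bitmap.

    Bit i of the bitmap is 1 when value i is present and 0 when it is NULL,
    numbered from the least significant bit of each byte.
    """
    data = array(typecode)
    bitmap = bytearray((len(values) + 7) // 8)
    for i, value in enumerate(values):
        if value is None:
            data.append(0)  # placeholder slot; the bitmap marks it as NULL
        else:
            data.append(value)
            bitmap[i // 8] |= 2 ** (i % 8)  # set bit i
    return data, bitmap


def is_valid(bitmap, i):
    """Return True when slot i holds a real value rather than NULL."""
    return (bitmap[i // 8] >> (i % 8)) % 2 == 1


# One contiguous buffer per column instead of one tuple per row.
ids, ids_valid = to_column([row[0] for row in rows], "q")          # 64-bit ints
amounts, amounts_valid = to_column([row[1] for row in rows], "d")  # 64-bit floats

print(list(ids), list(amounts))
print([is_valid(amounts_valid, i) for i in range(len(rows))])
```

&lt;p&gt;&lt;span&gt;Five nullable values need a single bitmap byte here, and each column lives in one contiguous typed buffer rather than in per-cell Python objects.&lt;/span&gt;&lt;/p&gt;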
&lt;p&gt;&lt;span&gt;For a database driver, this means the entire fetch loop can run in C++ and write values directly into Arrow buffers &amp;#8211; no Python object creation per row, no garbage-collector pressure. The DataFrame library receives a pointer to that memory and can begin operating on it immediately. Crucially, subsequent operations &amp;#8211; filters, joins, aggregations &amp;#8211; also work in-place on those same buffers. A Polars pipeline reading from mssql-python never needs to materialize intermediate Python objects at any stage, making Arrow the right foundation for high-throughput data processing pipelines.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;For users of mssql-python, this translates into four concrete benefits:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;strong&gt;Speed:&lt;/strong&gt; The columnar fetch path avoids Python object creation per row, which should make fetching noticeably faster for many SQL Server types &amp;#8211; especially temporal types like &lt;code&gt;DATETIME&lt;/code&gt; and &lt;code&gt;DATETIMEOFFSET&lt;/code&gt;, where Python-side per-value conversions are eliminated entirely.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;strong&gt;Lower memory usage:&lt;/strong&gt; A column of one million integers is a single contiguous C array, not a million individual Python objects.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;strong&gt;Seamless interoperability:&lt;/strong&gt; Polars, Pandas (via &lt;code&gt;ArrowDtype&lt;/code&gt;), DuckDB, Hugging Face datasets, and many other libraries all speak Arrow natively, so the hand-off between mssql-python and those tools is zero-copy.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;strong&gt;Purely additive:&lt;/strong&gt; Your existing &lt;code&gt;fetchone&lt;/code&gt;, &lt;code&gt;fetchmany&lt;/code&gt;, and &lt;code&gt;fetchall&lt;/code&gt; code is completely unaffected. You opt in only where you need it.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span&gt;&lt;div class=&quot;alert alert-warning&quot;&gt;&lt;strong&gt;Try it here: &lt;a href=&quot;https://github.com/microsoft/mssql-python&quot;&gt;pip install mssql-python&lt;/a&gt;&lt;/strong&gt;&lt;/div&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Calling all Python + SQL developers! We invite the community to try out mssql-python and help us shape the future of high-performance &lt;a href=&quot;https://github.com/microsoft/mssql-python&quot;&gt;SQL Server connectivity in Python&lt;/a&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span&gt;The Arrow Fetch APIs&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span&gt;Three APIs have been added to the &lt;code&gt;Cursor&lt;/code&gt; object.&lt;/span&gt;&lt;/p&gt;
&lt;h4&gt;&lt;span&gt;&lt;code&gt;1. cursor.arrow_batch(batch_size=8192)&lt;/code&gt; → &lt;code&gt;pyarrow.RecordBatch&lt;/code&gt;&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span&gt;Fetches the next batch of up to &lt;code&gt;batch_size&lt;/code&gt; rows as an Arrow &lt;code&gt;RecordBatch&lt;/code&gt; and advances the cursor. &lt;code&gt;RecordBatch&lt;/code&gt;es are the building blocks for higher-level Arrow data types like tables and the batch reader interface.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;import mssql_python

conn   = mssql_python.connect(conn_str)
cursor = conn.cursor()
cursor.execute(&quot;SELECT * FROM SalesData&quot;)

partial_data = cursor.arrow_batch(batch_size=50000)
process(partial_data)   # pyarrow.RecordBatch&lt;/pre&gt;
&lt;h4&gt;&lt;span&gt;&lt;code&gt;2. cursor.arrow(batch_size=8192)&lt;/code&gt; → &lt;code&gt;pyarrow.Table&lt;/code&gt;&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span&gt;Eagerly fetches the entire result set into a single Arrow &lt;code&gt;Table&lt;/code&gt;. This is the simplest path and works well for analytics queries where the result fits comfortably in memory. However, because it materialises the full result set at once, it can cause high peak RAM usage or out-of-memory errors on very large or unbounded queries. For large exports or ETL workloads, prefer &lt;code&gt;cursor.arrow_reader()&lt;/code&gt; (streaming, fetches lazily) or &lt;code&gt;cursor.arrow_batch()&lt;/code&gt; (fetch one batch at a time). In both cases, &lt;code&gt;batch_size&lt;/code&gt; is a tuning knob: larger batches improve throughput but increase peak memory; smaller batches reduce memory at the cost of slightly more per-batch overhead.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;cursor.execute(&quot;SELECT customer_id, order_date, amount FROM Orders&quot;)
table = cursor.arrow()

# Zero-copy conversion to Polars
import polars as pl
df = pl.DataFrame(table)

# Or to Pandas with Arrow-backed dtypes
import pandas as pd
df = table.to_pandas(types_mapper=pd.ArrowDtype)&lt;/pre&gt;
&lt;h4&gt;&lt;span&gt;&lt;code&gt;3. cursor.arrow_reader(batch_size=8192)&lt;/code&gt; → &lt;code&gt;pyarrow.RecordBatchReader&lt;/code&gt;&lt;/span&gt;&lt;/h4&gt;
&lt;p&gt;&lt;span&gt;Returns a lazy &lt;code&gt;RecordBatchReader&lt;/code&gt;. Batches are fetched only when the reader is iterated, enabling streaming over very large result sets. &lt;code&gt;RecordBatchReader&lt;/code&gt; is also accepted directly by DuckDB, Lance, and other Arrow-native libraries.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;cursor.execute(&quot;SELECT * FROM LargeEventLog&quot;)
reader = cursor.arrow_reader(batch_size=100000)

for batch in reader:
    sink.write(batch)&lt;/pre&gt;
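&lt;p&gt;&lt;span&gt;To make the lazy behaviour concrete, here is a pure-Python sketch of pull-based batching. It does not reproduce how mssql-python or pyarrow implement &lt;code&gt;RecordBatchReader&lt;/code&gt;; &lt;code&gt;iter_batches&lt;/code&gt; and &lt;code&gt;fake_rows&lt;/code&gt; are hypothetical helpers that only show why memory use is bounded by &lt;code&gt;batch_size&lt;/code&gt; rather than by the size of the result set.&lt;/span&gt;&lt;/p&gt;

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size rows, pulling from rows lazily."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def fake_rows(n, log):
    """Hypothetical row source that records each row as it is produced."""
    for i in range(n):
        log.append(i)
        yield (i, f"event-{i}")


produced = []
reader = iter_batches(fake_rows(10, produced), batch_size=4)

first = next(reader)
# Only the rows of the first batch have been pulled from the source so far.
pulled_after_first = len(produced)

sizes = [len(first)] + [len(batch) for batch in reader]
print(pulled_after_first, sizes)
```

&lt;p&gt;&lt;span&gt;A larger &lt;code&gt;batch_size&lt;/code&gt; means fewer trips through the loop but more rows held in memory at once, which is the same trade-off described for the Arrow fetch APIs above.&lt;/span&gt;&lt;/p&gt;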
&lt;h2&gt;&lt;span&gt;Testing&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span&gt;We validated the Arrow fetch path against the standard Python row fetch path across a range of SQL Server types &amp;#8211; numeric, temporal, string, and UUID &amp;#8211; for both single-column and wide (20-column) tables. The full test script and results are available in the Resources section; we encourage you to run them on your own hardware to see the difference for your workload.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;In our testing, the Arrow path was consistently faster for most SQL Server types. Temporal types showed the largest gains: types like &lt;code&gt;DATETIME&lt;/code&gt; and &lt;code&gt;DATETIMEOFFSET&lt;/code&gt; benefit significantly because the Arrow path handles timezone normalization and value encoding entirely in C++, eliminating per-value Python-side conversions. &lt;code&gt;DATETIMEOFFSET&lt;/code&gt; in particular showed some of the most pronounced speedups we observed.&lt;/span&gt;&lt;/p&gt;
&lt;h5&gt;&lt;span&gt;JSON Serialization Bonus&lt;/span&gt;&lt;/h5&gt;
&lt;p&gt;&lt;span&gt;The Arrow path can also benefit API workloads that serialize results to JSON. Instead of &lt;code&gt;fetchall()&lt;/code&gt; + &lt;code&gt;json.dumps()&lt;/code&gt;, fetch via &lt;code&gt;cursor.arrow()&lt;/code&gt;, wrap in a Polars DataFrame, and call &lt;code&gt;df.write_json()&lt;/code&gt; &amp;#8211; the entire pipeline bypasses Python objects and can be noticeably faster, especially for types like &lt;code&gt;DATETIMEOFFSET&lt;/code&gt;.&lt;/span&gt;&lt;/p&gt;
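&lt;p&gt;&lt;span&gt;For comparison, the row-based route looks roughly like the sketch below. The rows are hypothetical stand-ins for what a row fetch might return, and &lt;code&gt;encode_value&lt;/code&gt; is an illustrative helper, not part of mssql-python. The point is that &lt;code&gt;json.dumps()&lt;/code&gt; has to call back into Python for every timezone-aware datetime value, which is the per-value work the Arrow route avoids.&lt;/span&gt;&lt;/p&gt;

```python
import json
from datetime import datetime, timedelta, timezone

# Hypothetical rows, shaped like the tuples a row-based fetch might return.
tz = timezone(timedelta(hours=2))
rows = [
    (1, datetime(2026, 5, 4, 10, 7, 20, tzinfo=tz)),
    (2, datetime(2026, 5, 4, 12, 0, 0, tzinfo=tz)),
]


def encode_value(value):
    """Called by json.dumps once for each value it cannot serialize itself."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"unsupported type: {type(value).__name__}")


payload = json.dumps(
    [{"id": row_id, "created_at": created} for row_id, created in rows],
    default=encode_value,
)
print(payload)
```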
&lt;p&gt;&lt;span&gt;&lt;div class=&quot;alert alert-warning&quot;&gt;&lt;p class=&quot;alert-divider&quot;&gt;&lt;i class=&quot;fabric-icon fabric-icon--Warning&quot;&gt;&lt;/i&gt;&lt;strong&gt;NVARCHAR on Linux&lt;/strong&gt;&lt;/p&gt;Our Linux tests show longer fetch times for &lt;code&gt;NVARCHAR&lt;/code&gt; due to the current UTF-16 → UTF-8 conversion path. On Windows, &lt;code&gt;NVARCHAR&lt;/code&gt; fetches consistently faster with Arrow. A fix is targeted for a follow-up release.&lt;/div&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span&gt;Getting Started&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span&gt;Install or upgrade mssql-python, then add pyarrow:&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;pip install mssql-python pyarrow&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;For IDE type hints and static type checking:&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;pip install pyarrow-stubs&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;Then swap in &lt;code&gt;cursor.arrow()&lt;/code&gt; wherever you would have called &lt;code&gt;fetchall()&lt;/code&gt; and converted to a DataFrame. Your existing code is completely unaffected — Arrow support is purely additive.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;import mssql_python
import polars as pl

conn   = mssql_python.connect(conn_str)
cursor = conn.cursor()

cursor.execute(&quot;SELECT * FROM dbo.LargeSalesTable&quot;)
df = pl.DataFrame(cursor.arrow())

print(df.describe())&lt;/pre&gt;
&lt;h2&gt;&lt;span&gt;What&amp;#8217;s Next&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span&gt;One known area we are actively working to improve is &lt;a href=&quot;https://github.com/microsoft/mssql-python/pull/526&quot;&gt;NVARCHAR performance on Linux&lt;/a&gt;. SQL Server returns Unicode string data in UTF-16 encoding, which the driver must convert to UTF-8 before handing it to Arrow. On Windows this conversion uses a native system API that is very fast, but the current Linux code path goes through a slower chain of intermediate steps. As a result, &lt;code&gt;NVARCHAR&lt;/code&gt; columns on Linux show longer fetch times than the Python fetch path, the opposite of every other type. A fix using a more efficient codec is in progress for a follow-up release. On Windows, our tests show &lt;code&gt;NVARCHAR&lt;/code&gt; fetching noticeably faster with Arrow, and Linux will follow.&lt;/span&gt;&lt;/p&gt;
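&lt;p&gt;&lt;span&gt;The conversion itself is easy to show in Python, even though the driver performs it in native code. The sketch below is only an illustration of the transcoding step: &lt;code&gt;utf16le_to_utf8&lt;/code&gt; is a hypothetical helper, the sample string is made up, and the little-endian byte order is an assumption for this example.&lt;/span&gt;&lt;/p&gt;

```python
def utf16le_to_utf8(raw):
    """Transcode UTF-16 (little-endian, assumed) bytes to UTF-8 bytes."""
    return raw.decode("utf-16-le").encode("utf-8")


# Sample text mixing ASCII, Latin-1, CJK, and one character outside the BMP.
text = "Gra\u00dfl \u65e5\u672c\u8a9e \U0001F600"
wire = text.encode("utf-16-le")  # stand-in for bytes received from the server

converted = utf16le_to_utf8(wire)
print(len(wire), len(converted))
```

&lt;p&gt;&lt;span&gt;Every string value in an &lt;code&gt;NVARCHAR&lt;/code&gt; column goes through a step like this, so the speed of the transcoder matters directly for fetch time.&lt;/span&gt;&lt;/p&gt;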
&lt;h2&gt;&lt;span&gt;A Note of Thanks&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span&gt;This feature was contributed by &lt;strong&gt;Felix Graßl (@ffelixg)&lt;/strong&gt;, the author of &lt;a href=&quot;https://github.com/ffelixg/zodbc&quot;&gt;zodbc&lt;/a&gt;, his own Zig-based ODBC driver. His deep familiarity with ODBC and Arrow made this a thorough, well-tested contribution covering both Linux and Windows, and all three fetch patterns. We are very grateful for his work and the care he brought to this feature.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span&gt;Resources&lt;/span&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;a href=&quot;https://github.com/microsoft/mssql-python&quot;&gt;mssql-python on GitHub&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;a href=&quot;https://github.com/microsoft/mssql-python/pull/354&quot;&gt;Arrow Support PR&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;a href=&quot;https://gist.github.com/ffelixg/cb302e606920c88f5450f9fd20758e86&quot;&gt;Full test script and results&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;a href=&quot;https://arrow.apache.org/docs/format/CDataInterface.html&quot;&gt;Apache Arrow C Data Interface&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;a href=&quot;https://github.com/ffelixg/zodbc&quot;&gt;zodbc — Felix&amp;#8217;s Zig-based ODBC driver&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;&lt;strong&gt;&lt;span&gt;Try It and Share Your Feedback! &lt;/span&gt;&lt;/strong&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span&gt;We invite you to: &lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span&gt;Check out the &lt;a href=&quot;https://github.com/microsoft/mssql-python&quot;&gt;mssql-python&lt;/a&gt; driver and integrate it into your projects. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;Share your thoughts: Open &lt;a href=&quot;https://github.com/microsoft/mssql-python/issues&quot;&gt;issues&lt;/a&gt;, suggest features, and contribute to the project. &lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;Join the conversation: &lt;a href=&quot;https://github.com/microsoft/mssql-python/discussions&quot;&gt;GitHub Discussions&lt;/a&gt; | &lt;a href=&quot;https://techcommunity.microsoft.com/category/sql-server/blog/sqlserver&quot;&gt;SQL Server Tech Community&lt;/a&gt;. &lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span&gt;&lt;div class=&quot;alert alert-success&quot;&gt;&lt;p class=&quot;alert-divider&quot;&gt;&lt;i class=&quot;fabric-icon fabric-icon--Lightbulb&quot;&gt;&lt;/i&gt;&lt;strong&gt;Use Python Driver with Free Azure SQL Database&lt;/strong&gt;&lt;/p&gt;You can use the Python Driver with the free version of Azure SQL Database!&lt;/div&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;img src=&quot;https://s.w.org/images/core/emoji/17.0.2/72x72/2705.png&quot; alt=&quot;✅&quot; class=&quot;wp-smiley&quot; /&gt; &lt;a href=&quot;https://learn.microsoft.com/en-us/azure/azure-sql/database/free-offer?view=azuresql&quot;&gt;Deploy Azure SQL Database for free&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;img src=&quot;https://s.w.org/images/core/emoji/17.0.2/72x72/2705.png&quot; alt=&quot;✅&quot; class=&quot;wp-smiley&quot; /&gt; &lt;a href=&quot;https://learn.microsoft.com/en-us/azure/azure-sql/managed-instance/free-offer?view=azuresql&quot;&gt;Deploy Azure SQL Managed Instance for free&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;Perfect for testing, development, or learning scenarios without incurring costs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;div class=&quot;alert alert-primary&quot;&gt;Have questions or feedback? Open an &lt;a href=&quot;https://github.com/microsoft/mssql-python/issues&quot;&gt;issue&lt;/a&gt; or &lt;a href=&quot;https://github.com/microsoft/mssql-python/discussions&quot;&gt;discussion&lt;/a&gt; on GitHub, or reach out to the team at &lt;a href=&quot;mailto:mssql-python@microsoft.com&quot;&gt;mssql-python@microsoft.com&lt;/a&gt;&lt;/div&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;The post &lt;a href=&quot;https://devblogs.microsoft.com/python/introducing-apache-arrow-support-in-mssql-python/&quot;&gt;Introducing Apache Arrow Support in mssql-python&lt;/a&gt; appeared first on &lt;a href=&quot;https://devblogs.microsoft.com/python&quot;&gt;Microsoft for Python Developers Blog&lt;/a&gt;.&lt;/p&gt;</description>
	<pubDate>Mon, 04 May 2026 04:33:00 +0000</pubDate>
</item>
<item>
	<title>&quot;Michiel's Blog&quot;: Talk at PyGrunn on httpxyz</title>
	<guid>https://tildeweb.nl/~michiel/pygrunn-httpxyz.html</guid>
	<link>https://tildeweb.nl/~michiel/pygrunn-httpxyz.html</link>
	<description>&lt;p&gt;On Friday 8th of May 2026 I will be giving a talk on our new fork of the
popular Python package &lt;strong&gt;httpx&lt;/strong&gt; called &lt;a href=&quot;https://httpxyz.org&quot;&gt;httpxyz&lt;/a&gt; at
&lt;a href=&quot;https://pygrunn.org&quot;&gt;PyGrunn&lt;/a&gt;. &lt;strong&gt;PyGrunn&lt;/strong&gt; is a full-day Python (&amp;ldquo;and
friends&amp;rdquo;) conference in Groningen, The Netherlands. &lt;strong&gt;httpx&lt;/strong&gt; is a top-100
Python package for sending HTTP requests, but it has not had a release since the end of
2024; in addition, all issues were recently set to hidden and all discussions were
closed. &lt;strong&gt;httpxyz&lt;/strong&gt; is our friendly fork with lots of fixes for both serious and
more niche issues. For more info, read my
&lt;a href=&quot;https://tildeweb.nl/~michiel/tags/python/httpxyz.html&quot;&gt;announcement post for httpxyz&lt;/a&gt;. I will talk about why we made the
fork and how we approach it. I&amp;rsquo;ll also delve into how to build a performant API
client in Python and technical details of HTTP.&lt;/p&gt;
&lt;p&gt;This is an expanded version of the talk I gave at
&lt;a href=&quot;https://tildeweb.nl/~michiel/tags/python/pyamsterdam-httpxyz.html&quot;&gt;PyAmsterdam&lt;/a&gt; last week, and is part of my &amp;lsquo;promotion
efforts&amp;rsquo; for &lt;a href=&quot;https://httpxyz.org&quot;&gt;httpxyz&lt;/a&gt;. I look forward to giving my talk, and
I hope to see you there!&lt;/p&gt;
&lt;p&gt;&lt;img alt=&quot;Me presenting at PyAmsterdam&quot; class=&quot;small&quot; src=&quot;https://tildeweb.nl/~michiel/tags/python/images/mwb-zstandard.jpg&quot; /&gt;&lt;/p&gt;</description>
	<pubDate>Sun, 03 May 2026 20:00:00 +0000</pubDate>
</item>
<item>
	<title>PyCon: Asking the Key Questions: Q&amp;amp;A with the PyCon US 2026 keynote speaker Pablo Galindo Salgado</title>
	<guid>https://pycon.blogspot.com/2026/05/asking-key-questions-q-with-pycon-us_3.html</guid>
	<link>https://pycon.blogspot.com/2026/05/asking-key-questions-q-with-pycon-us_3.html</link>
	<description>&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;h2&gt;&lt;span&gt;&lt;i&gt;&lt;span&gt;This is a blog series where we're asking each of our PyConUS 2026 keynote speakers about their journey into tech, how excited they are for PyConUS, and any tips they can provide for an awesome conference experience! 
&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/h2&gt;&lt;span id=&quot;docs-internal-guid-b8eeb924-7fff-d507-a85c-820ef710f3e0&quot;&gt;&lt;h2&gt;&lt;span&gt;&lt;i&gt;&lt;span&gt;&lt;b&gt;Thank you Pablo for this interview! You can learn more about Pablo's keynote on the &lt;a href=&quot;https://us.pycon.org/2026/about/keynote-speakers/&quot; target=&quot;_blank&quot;&gt;PyConUS Keynote Speakers page&lt;/a&gt; and you can also attend &lt;/b&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;i&gt;Pablo Galindo Salgado&lt;/i&gt;&lt;i&gt;&lt;span&gt;&lt;b&gt;'s meet and greet at the PSF Booth in the Expo Hall on Saturday May 16 after Pablo's keynote. &lt;/b&gt;&lt;/span&gt;&lt;/i&gt;&lt;/h2&gt;&lt;div class=&quot;separator&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqksJz6fpBjb5gAen81QqN06rdy9UFkePrHnTY6jOpGFLs2I5NDxi30PJm5Q5XL7qopqwDcY0SYqCLKdiyf6OcVWr8l1wS3iTSi4UtIaGbKGQS6L42_IBWfv6Touv5yiKIlMHoQmyFxGgsGdnJjG450P5aOd16PNE0qZfSMijgnR4MFIw6JC1sZw/s1080/Pablo%20Galindo%20Salgado.png&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;320&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiqksJz6fpBjb5gAen81QqN06rdy9UFkePrHnTY6jOpGFLs2I5NDxi30PJm5Q5XL7qopqwDcY0SYqCLKdiyf6OcVWr8l1wS3iTSi4UtIaGbKGQS6L42_IBWfv6Touv5yiKIlMHoQmyFxGgsGdnJjG450P5aOd16PNE0qZfSMijgnR4MFIw6JC1sZw/s320/Pablo%20Galindo%20Salgado.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;b&gt;How did you get started in tech/Python? Did you have a friend or a mentor that helped you?&lt;/b&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;I got into tech through the back door: as part of my Physics studies, needing to write code to run simulations and process data. The simulations themselves were in Fortran 77 and C++ but for the rest I tried a bunch of languages before landing on Python, but Python had something the others didn't: it was genuinely fun. 
And then I discovered the community, and that was it. I didn't have a single mentor so much as a whole constellation of generous people in the Python world. The core dev team is full of some of the most talented and kind people I've ever met, and I learn from them every single day.&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;&lt;b&gt;What do you think the most important work you've ever done is?&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;Honestly, a tough one. I've done technical work I'm proud of: the PEG parser, better error messages, performance and memory profilers, debuggers, work in the garbage collector, the Steering Council... but if I'm being real, the human contributions matter more to me than the technical ones. The contributors I've mentored who became core developers themselves. The talk that made someone feel like they could contribute too. The code will always be there (or not!) but helping people feel welcome and capable in this community is the work that actually keeps me going.&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;&lt;b&gt;Have you been to PyCon US before? What are you looking forward to?&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;This will be my sixth PyCon US! I should probably be blasé about it by now, but I'm genuinely not. Every year I get that same rush walking into a room full of people who care as deeply about this stuff as I do. What I'm most looking forward to is seeing everyone: there are so many people in this community I only get to see once a year, and those reunions mean the world to me. And the conversations. The hallway conversations, the late-night ones, the ones that start at a talk and end up somewhere completely unexpected. 
That's where the magic happens.&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;&lt;b&gt;Do you have any advice for first-time conference goers?&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;Talk to people! The hallway track is real and it's where some of the best things at PyCon happen. Introduce yourself, go to the social events, ask questions after talks: everyone here is friendly and almost everyone remembers what it felt like to be new. And please, go to the Sprints. They are so underrated. You don't need to be an expert, you just need to show up and people will help you find something to work on, and it might just be the start of something big. Finally, be kind to yourself. You won't see everything and that's okay. Pick what excites you, let yourself be surprised, and enjoy being part of something wonderful.&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;&lt;b&gt;Can you tell us about an open source project not enough people know about?&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;&lt;br /&gt;&lt;p dir=&quot;ltr&quot;&gt;&lt;span&gt;A twist: &lt;a href=&quot;https://github.com/python/cpython&quot; target=&quot;_blank&quot;&gt;CPython&lt;/a&gt; itself. Everyone knows about CPython, but I don't think people really know it as a community: a place where real humans show up every day and do imperfect, collaborative, joyful work together. There's a persistent myth that core developers are geniuses in an ivory tower who never make mistakes. I want to bust that completely. We are normal people. We make mistakes, we don't always know the answers, we learn from each other, and we have an enormous amount of fun. How CPython gets built is still a mystery to many Python developers, but it really doesn't need to be. The project is open, the conversations are public, and the door is open to anyone who wants to contribute. Come take a look! 
You might be surprised at how human it all is.&lt;/span&gt;&lt;/p&gt;&lt;div&gt;&lt;span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;/span&gt;</description>
	<pubDate>Sun, 03 May 2026 15:29:30 +0000</pubDate>
</item>
<item>
	<title>Real Python: Quiz: Revisit Python Fundamentals</title>
	<guid>https://realpython.com/quizzes/revisit-python-fundamentals/</guid>
	<link>https://realpython.com/quizzes/revisit-python-fundamentals/</link>
	<description>&lt;p&gt;In this quiz, you&amp;rsquo;ll revisit the core concepts covered in the &lt;a href=&quot;https://realpython.com/learning-paths/python3-introduction/&quot;&gt;Revisit Python Fundamentals&lt;/a&gt; learning path. The 15 questions span variables, data types, operators, expressions, keywords, and exceptions, giving you a way to check that you understood the most important ideas.&lt;/p&gt;
&lt;p&gt;Take your time and revisit any topics that feel rusty before moving on to the next learning path.&lt;/p&gt;
</description>
	<pubDate>Sun, 03 May 2026 12:00:00 +0000</pubDate>
</item>
<item>
	<title>PyCon: Asking the Key Questions: Q&amp;amp;A with the PyCon US 2026 keynote speaker amanda casari</title>
	<guid>https://pycon.blogspot.com/2026/05/asking-key-questions-q-with-pycon-us.html</guid>
	<link>https://pycon.blogspot.com/2026/05/asking-key-questions-q-with-pycon-us.html</link>
	<description>&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;h2&gt;&lt;span&gt;&lt;span&gt;&lt;i&gt;&lt;span&gt;This is a blog series where we're asking each of our PyConUS 2026 keynote speakers about their journey into tech, how excited they are for PyConUS, and any tips they can provide for an awesome conference experience! 
&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h2&gt;&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;i&gt;&lt;span&gt;&lt;b&gt;Thank you amanda for this interview! You can learn more about amanda's keynote on the &lt;a href=&quot;https://us.pycon.org/2026/about/keynote-speakers/&quot; target=&quot;_blank&quot;&gt;PyConUS Keynote Speakers page&lt;/a&gt; and you can also attend amanda's meet and greet at the PSF Booth in the Expo Hall on Thursday May 14 during the opening reception at 5 - 6pm PT. &lt;/b&gt;&lt;/span&gt;&lt;/i&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;&lt;div class=&quot;separator&quot;&gt;&lt;a href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMxxIJaBGSo_LsefYBgWkDUSFqzw5U7Zt8v9_CXf5rzelAXjpexEhuDFc7QLA1o8x44yWTrpzFDy4diHM8LhYbY3SwCcZRha25hUsnjEhTDoFWFlUEJOCl1ghsxGM_dbj3_IBVWQvJrn79-cCIG5buvmhXalKLS-3b8BSqPjRUwEVzysJcw3ID0Q/s1080/amanda%20casari.png&quot;&gt;&lt;img border=&quot;0&quot; height=&quot;320&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjMxxIJaBGSo_LsefYBgWkDUSFqzw5U7Zt8v9_CXf5rzelAXjpexEhuDFc7QLA1o8x44yWTrpzFDy4diHM8LhYbY3SwCcZRha25hUsnjEhTDoFWFlUEJOCl1ghsxGM_dbj3_IBVWQvJrn79-cCIG5buvmhXalKLS-3b8BSqPjRUwEVzysJcw3ID0Q/s320/amanda%20casari.png&quot; width=&quot;320&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/span&gt;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;&lt;i&gt;Without giving too many spoilers, tell us what your keynote is about? &lt;/i&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;More and more these days, amanda is asking how do you make space in open source for hope.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;&lt;i&gt;How did you get started in tech/Python? Did you have a friend or a mentor that helped you?&lt;/i&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;My first time wrestling with Python was in 2009 when I was struggling to set up a webserver for a graduate student project building a microgrid testbed for a local national park. When I moved to Seattle a few years later, the local Python tech community was extremely welcoming, friendly, and really focused on bringing people together to make them feel connected. I'm especially grateful to &lt;a href=&quot;https://seattle.pyladies.com/&quot; target=&quot;_blank&quot;&gt;PyLadies Seattle&lt;/a&gt; leader Wendy Grus, and later Carol Willing, for entertaining and celebrating with me so many silly ideas.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;
&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;i&gt;&lt;b&gt;What do you think the most important work you’ve ever done is? Or if you think it might still be in the future, can you tell us something about your plans?&lt;/b&gt;&lt;/i&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;It's hard for me to judge what the most important work I've ever done is, or will be. So much of my work is a series of incremental changes or decisions made with a goal to impact what is next, rather than what is nearby. What I will always be most proud of is when I'm given the opportunity to build teams with other people who challenge me. The most important is always the people, and how we spend the time together when our lives intersect.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;&lt;i&gt;Have you been to PyCon US before? What are you looking forward to?&lt;/i&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;YES! In no particular order, I'm looking forward to: community booth time, meals with old friends, meeting new friends, and finally giving the 5K a noble effort.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;&lt;i&gt;Do you have any advice for first-time conference goers or any general conference tips?&lt;/i&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Pick at least one talk or session that is completely new to you, or that you have no idea whether or not it intersects with your interests. If it's a low-volume crowd, sit near the front, and be the silent, attentive, and encouraging audience member that every speaker needs.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;&lt;i&gt;Can you tell us about an open source or open culture project that you think not enough people know about?&lt;/i&gt;&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/h4&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;I'm a MASSIVE space nerd. Last year I finally learned about &lt;a href=&quot;https://www.rtems.org/&quot; target=&quot;_blank&quot;&gt;RTEMS&lt;/a&gt;, and now I'm obsessed. Everyone talked about the proprietary software failure from the recent Artemis II launch, but they SHOULD have been talking about RTEMS being onboard!!! As a successful open source project that's been running for over 30 years, I want everyone to know about this.&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;i&gt;&lt;br /&gt;&lt;/i&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;</description>
	<pubDate>Sat, 02 May 2026 12:48:56 +0000</pubDate>
</item>
<item>
	<title>Rodrigo Girão Serrão: TIL #144 – Sentinel built-in</title>
	<guid>https://mathspp.com/blog/til/sentinel-builtin</guid>
	<link>https://mathspp.com/blog/til/sentinel-builtin</link>
	<description>&lt;img alt=&quot;&quot; src=&quot;https://mathspp.com/images/7/d/f/1/7/7df17cf0130552dd1d8854e6ad9f82d635af26d2-thumbnail.webp&quot; /&gt;
                                &lt;p&gt;Today I learned Python 3.15 will get a new sentinel built-in.&lt;/p&gt;

&lt;p&gt;Sentinel values are unique placeholder values that are commonly used in programming.
Python 3.15 ships with a new built-in &lt;code&gt;sentinel&lt;/code&gt; that can be used to create new sentinel values:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-py&quot;&gt;# Python 3.15+
&amp;gt;&amp;gt;&amp;gt; MISSING = sentinel(&quot;MISSING&quot;)
&amp;gt;&amp;gt;&amp;gt; MISSING
MISSING&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Before this built-in was added, the most common sentinel idiom used the built-in &lt;code&gt;object&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&quot;language-py&quot;&gt;MISSING = object()

def my_function(some_arg=MISSING):
    if some_arg is MISSING:
        ... # Handle the sentinel&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In the function above, the sentinel value &lt;code&gt;MISSING&lt;/code&gt; is being used to check whether the user passed &lt;em&gt;anything&lt;/em&gt; as the parameter &lt;code&gt;some_arg&lt;/code&gt; or not.
&lt;a href=&quot;https://peps.python.org/pep-0661/&quot; target=&quot;_blank&quot; rel=&quot;nofollow noopener noreferrer&quot; class=&quot;external-link no-image&quot;&gt;PEP 661&lt;/a&gt;, which introduced this built-in, has a great discussion of why this pattern, and many other sentinel patterns, fall short.
In general, each common sentinel idiom suffers from at least one of the following problems:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Bad string repr&lt;/strong&gt;: the &lt;a href=&quot;https://mathspp.com/blog/pydonts/str-and-repr&quot;&gt;string representation&lt;/a&gt; is too long and uninformative&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Type unsafe&lt;/strong&gt;: the sentinels don't have a distinct type so it becomes hard or impossible to write code that uses the sentinels and is type safe&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Unexpected copy behaviour&lt;/strong&gt;: the sentinels can't be copied or pickled without breaking the sentinel behaviour&lt;/li&gt;
&lt;/ol&gt;</description>
	<pubDate>Fri, 01 May 2026 17:49:00 +0000</pubDate>
</item>
<item>
	<title>Mike Driscoll: Textual-cogs 0.0.5 Released</title>
	<guid>https://blog.pythonlibrary.org/2026/05/01/textual-cogs-0-0-5-released/</guid>
	<link>https://blog.pythonlibrary.org/2026/05/01/textual-cogs-0-0-5-released/</link>
	<description>&lt;p&gt;I always thought it would be fun to create my own open source libraries or applications and distribute them somehow. &lt;span&gt;When I started writing my book, &lt;a href=&quot;https://driscollis.gumroad.com/l/textual&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;Creating TUI Applications with Textual and Python&lt;/a&gt;, I took the plunge and wrote a helper package called &lt;a href=&quot;https://github.com/driscollis/textual-cogs&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;textual-cogs&lt;/a&gt;, which is a collection of reusable dialogs and widgets for Textual.&lt;/span&gt; Right now, it is mostly just dialogs, but I do hope to add some widgets to it as well.&lt;/p&gt;
&lt;p&gt;Anyway, I have released two new dialogs in the past week, with one in v0.0.4 and the other in v0.0.5.&lt;/p&gt;
&lt;h2&gt;A Textual Directory Dialog&lt;/h2&gt;
&lt;p&gt;In v0.0.5, I added a directory dialog similar to wxPython&amp;#8217;s wx.DirDialog. The dialog will display the user&amp;#8217;s directories and allow the user to choose one. It will also allow the user to create a new folder.&lt;/p&gt;
&lt;p&gt;Here&amp;#8217;s a screenshot:&lt;/p&gt;
&lt;p&gt;&lt;img class=&quot;aligncenter size-full wp-image-12786&quot; src=&quot;https://blog.pythonlibrary.org/wp-content/uploads/2026/05/cog_dir_dlg.png&quot; alt=&quot;Textual cogs - Directory Dialog&quot; width=&quot;739&quot; height=&quot;578&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;A Textual Open File Dialog&lt;/h2&gt;
&lt;p&gt;In v0.0.4, I also added an open file dialog. Textual cogs already has a save file dialog, and I had meant to include the open file dialog originally, but only recently got it added.&lt;/p&gt;
&lt;p&gt;Here is what that looks like:&lt;/p&gt;
&lt;p&gt;&lt;img class=&quot;aligncenter size-full wp-image-12787&quot; src=&quot;https://blog.pythonlibrary.org/wp-content/uploads/2026/05/cog_open_flle_dlg.png&quot; alt=&quot;Textual cogs - Open File Dialog&quot; width=&quot;941&quot; height=&quot;575&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;How to Install textual-cogs&lt;/h2&gt;
&lt;p&gt;You can install textual-cogs using pip or uv:&lt;/p&gt;
&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;python -m pip install textual-cogs&lt;/pre&gt;
&lt;h2&gt;Where to Get textual-cogs&lt;/h2&gt;
&lt;p&gt;You can find textual-cogs on the following websites:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/driscollis/textual-cogs&quot;&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://pypi.org/project/textual-cogs/0.0.5/&quot;&gt;Python Package Index&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The post &lt;a href=&quot;https://blog.pythonlibrary.org/2026/05/01/textual-cogs-0-0-5-released/&quot;&gt;Textual-cogs 0.0.5 Released&lt;/a&gt; appeared first on &lt;a href=&quot;https://blog.pythonlibrary.org&quot;&gt;Mouse Vs Python&lt;/a&gt;.&lt;/p&gt;</description>
	<pubDate>Fri, 01 May 2026 14:58:03 +0000</pubDate>
</item>
<item>
	<title>Real Python: The Real Python Podcast – Episode #293: Agentic Data Science Pair Programming With marimo pair</title>
	<guid>https://realpython.com/podcasts/rpp/293/</guid>
	<link>https://realpython.com/podcasts/rpp/293/</link>
	<description>&lt;p&gt;How do you add agent skills to your data science workflow? How can a coding agent assist with data wrangling and research? This week on the show, Trevor Manz from marimo joins us to discuss marimo pair.&lt;/p&gt;
</description>
	<pubDate>Fri, 01 May 2026 12:00:00 +0000</pubDate>
</item>
<item>
	<title>Real Python: Quiz: The Factory Method Pattern and Its Implementation in Python</title>
	<guid>https://realpython.com/quizzes/factory-method-python/</guid>
	<link>https://realpython.com/quizzes/factory-method-python/</link>
	<description>&lt;p&gt;In this quiz, you&amp;rsquo;ll test your understanding of
&lt;a href=&quot;https://realpython.com/factory-method-python/&quot;&gt;The Factory Method Pattern and Its Implementation in Python&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Factory Method is one of the most widely used design patterns, and it&amp;rsquo;s a powerful tool for separating object creation from object use in your code.&lt;/p&gt;
&lt;p&gt;By working through this quiz, you&amp;rsquo;ll revisit the components of the pattern, recognize opportunities to apply it, and see how you can implement a reusable, general-purpose solution in Python.&lt;/p&gt;
</description>
	<pubDate>Fri, 01 May 2026 12:00:00 +0000</pubDate>
</item>
<item>
	<title>Luke Plant: Inverse Sapir-Whorf and programming languages</title>
	<guid>https://lukeplant.me.uk/blog/posts/inverse-sapir-whorf-and-programming-languages/</guid>
	<link>https://lukeplant.me.uk/blog/posts/inverse-sapir-whorf-and-programming-languages/</link>
	<description>&lt;p&gt;The &lt;a class=&quot;reference external&quot; href=&quot;https://en.wikipedia.org/wiki/Linguistic_relativity&quot;&gt;Sapir-Whorf hypothesis&lt;/a&gt;, in its simplest form, is the idea that the language you speak influences the thoughts you think. This post is about a twist on this idea, that I’m calling “Inverse Sapir-Whorf” (for want of a better term), and how we see it in computer programming languages.&lt;/p&gt;
&lt;p&gt;Sapir-Whorf is one of those ideas that has been popularised in general culture in a rather misrepresented and &lt;a class=&quot;reference external&quot; href=&quot;https://en.wikipedia.org/wiki/Arrival_(film)&quot;&gt;exaggerated form&lt;/a&gt;. In the field of linguistics, not many people today take seriously the “strong” forms of Sapir-Whorf, such as “linguistic determinism” – the idea that a language controls your thoughts or limits what you can think, or that you even need certain languages to think certain thoughts.&lt;/p&gt;
&lt;p&gt;For example, just because a language might lack grammatical tenses, it doesn’t at all follow that the speakers will be more limited in how they think about time – there are always other ways you can express time.&lt;/p&gt;
&lt;p&gt;There is a fair amount of evidence that spoken languages can affect perception, skill and attitudes in certain areas, but it’s usually hard to demonstrate a large direct effect.&lt;/p&gt;
&lt;p&gt;Inverse Sapir-Whorf is a bit different. I haven’t been able to track down where I first came across the idea, but it goes like this: if classic Sapir-Whorf says your language limits what you can say or think, or makes it hard to say some things, inverse Sapir-Whorf says your language limits what you &lt;strong&gt;can’t&lt;/strong&gt; say, or makes it hard &lt;strong&gt;not&lt;/strong&gt; to say some things, or even hard not to think about some things. Some examples might clear things up.&lt;/p&gt;

&lt;h2&gt;Examples in natural language&lt;/h2&gt;
&lt;p&gt;There are many examples to choose from, but they are not always obvious to native speakers of a language. I’ll pick just a few.&lt;/p&gt;

&lt;h3&gt;English: temporary or permanent present tense&lt;/h3&gt;
&lt;p&gt;What’s the difference between someone saying “I’m living in London” and “I live in London”? A non-native speaker may not pick this up at all, and a native speaker may pick it up only subconsciously, but “I’m living in London” reveals that the arrangement is temporary.&lt;/p&gt;
&lt;p&gt;Now, this might not even be to do with the actual length of time you have been living there, because “temporary” is pretty relative. It might be more about how much you &lt;em&gt;like&lt;/em&gt; London. You have to choose a tense, and because you typically do so subconsciously, the language is forcing you to reveal things – either the period of time you’ve been living somewhere, or how you feel about it.&lt;/p&gt;


&lt;h3&gt;English/Turkish/French: gendered pronouns and nouns&lt;/h3&gt;
&lt;p&gt;In English, in normal speech you are going to use “he” or “she” when referring to a specific person. “Singular they” does exist, but it’s very unnatural if you are talking about a specific person of known or assumed sex.&lt;/p&gt;
&lt;p&gt;You can compare this to another language which doesn’t have gendered pronouns, such as Turkish, which just has “o” for he/she/it. The lack of gendered pronouns in Turkish doesn’t stop you from thinking or talking about a person’s sex, or produce a “less gendered society”, or anywhere close, so it would be difficult to find support for normal Sapir-Whorf here. But the inverse Sapir-Whorf is obvious – English pronouns push you to talk about it whether you want to or not. If you are trying to talk about someone you know, but do so anonymously, it can be very hard to avoid making their identification easier by revealing their sex with an inadvertent “him” or “her”.&lt;/p&gt;
&lt;p&gt;Different again is French, in which &lt;em&gt;nouns&lt;/em&gt; are gendered, which in some cases can force you to reveal information. If you translate “my friend” into French, you have to choose between “mon ami” (male friend) and “mon amie” (female friend), which are distinct, at least in written form, or “mon copain” vs “ma copine”. Possessive pronouns are also interesting – they are gendered in both English and French (his/her, son/sa), but refer to the gender of the possessor and possessee respectively, and so reveal different information.&lt;/p&gt;


&lt;h3&gt;Turkish: “mış” tense&lt;/h3&gt;
&lt;p&gt;With some simplifications, Turkish has two main past tenses: there is the normal one that is similar to “simple past” in English, and then there is the “mış” form (you can pronounce that “mish” if you want).&lt;/p&gt;
&lt;p&gt;This has &lt;a class=&quot;reference external&quot; href=&quot;https://www.turkishtextbook.com/beginner-mis-forms/&quot;&gt;various functions&lt;/a&gt;, but when describing a past event, this form is used when you have second hand or unreliable information. If someone asks you “Did Fred come to work on Monday?”, then if you saw him you would use the normal past tense “geldi” (he came), but if you only &lt;em&gt;heard&lt;/em&gt; that he came you would instead say “gelmiş” (he came, but second hand information).&lt;/p&gt;
&lt;p&gt;The interesting thing to me as a non-native speaker was the effect of having these options, in contrast to English where you can just use simple past tense without any specific indication of reliability or where the information came from. In certain circumstances, Turkish forces you to include information about your level of certainty or whether you witnessed something –  the simple past form is not neutral, because the existence of the “mış” form makes it an unnatural choice if it is not the most appropriate of the two.&lt;/p&gt;
&lt;p&gt;Interestingly, having learned to think that way, my wife and I have noticed an effect on our English. Often in Turkish the “mış” suffix would come at the end of the last word in a sentence, so now quite frequently we get to the end of an English sentence and notice that we haven’t put in any marker for “this-is-second-hand-info-I-didn’t-actually-witness-it”, and so we tack “mış” on the end.&lt;/p&gt;
&lt;p&gt;Of course, you can easily express the same thing in English, using words like “apparently” and other means, but English doesn’t &lt;strong&gt;force&lt;/strong&gt; you to specify, while Turkish pretty much does.&lt;/p&gt;


&lt;h3&gt;Comments&lt;/h3&gt;
&lt;p&gt;You often don’t notice these things until you learn another language, or attempt to teach your language to a foreigner. You kind of just understand them subconsciously. The vast majority of times you choose simple present over present continuous, for example, you won’t be &lt;strong&gt;consciously&lt;/strong&gt; thinking about what that implies.&lt;/p&gt;
&lt;p&gt;I should also note that when a language forces you to express something, it might not be in the form of something &lt;em&gt;included&lt;/em&gt;, but in something &lt;em&gt;omitted&lt;/em&gt;. For example, I might say “I love cake” or “I love the cake”. In the first case, I’m talking about cake generally, in the second about a specific cake. It is the absence of the word “the” in the first case that makes it unambiguous that I’m referring to all cake, because if I’m referring to a specific cake, I &lt;strong&gt;must&lt;/strong&gt; use the word “the” or some other marker like “this”. In another language, there might not be a direct equivalent to this distinction.&lt;/p&gt;



&lt;h2&gt;Examples in programming&lt;/h2&gt;
&lt;p&gt;When it comes to programming languages, I think that the “straight” version of  Sapir-Whorf is closer to being true - in some programming languages it is simply hard to express certain concepts. For example, in a language like Python or Haskell it’s hard (though not impossible) to talk about memory allocations. We often talk about the limitations of a language in terms of “things that are hard to express” in that language. Hillel Wayne has some more discussion of this in his post &lt;a class=&quot;reference external&quot; href=&quot;https://buttondown.com/hillelwayne/archive/sapir-whorf-does-not-apply-to-programming/&quot;&gt;Sapir-Whorf does not apply to Programming Languages&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;But I want to talk more about Inverse Sapir-Whorf. What is the language forcing you to talk about, even if you don’t actually care about it?&lt;/p&gt;
&lt;p&gt;I think there are actually many, many examples of this, but seeing them can be quite hard, and often requires the “foreigner perspective” that comes from learning multiple languages.&lt;/p&gt;
&lt;p&gt;Here are a few:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Most languages force you to express the order in which computation should be done. For example, in Python:&lt;/p&gt;
&lt;div class=&quot;code&quot;&gt;&lt;pre class=&quot;code python&quot;&gt;&lt;a id=&quot;rest_code_11fb4ed97b504dcf83b998b745f49865-1&quot; name=&quot;rest_code_11fb4ed97b504dcf83b998b745f49865-1&quot; href=&quot;https://lukeplant.me.uk/blog/posts/inverse-sapir-whorf-and-programming-languages/#rest_code_11fb4ed97b504dcf83b998b745f49865-1&quot;&gt;&lt;/a&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;some_func&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;z&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Here you are saying:&lt;/p&gt;
&lt;ul class=&quot;simple&quot;&gt;
&lt;li&gt;&lt;p&gt;first compute &lt;code class=&quot;docutils literal&quot;&gt;y + 1&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;then compute &lt;code class=&quot;docutils literal&quot;&gt;z + 2&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;then pass these two values as arguments to &lt;code class=&quot;docutils literal&quot;&gt;some_func&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You might not be very conscious of specifying this ordering, but you are doing it, and in Python, there isn’t a way to express the above computation which doesn’t also specify order. Most languages are similar, although in some &lt;a class=&quot;reference external&quot; href=&quot;https://en.cppreference.com/w/c/language/eval_order.html&quot;&gt;it gets very complicated&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;A few languages are very different, however. In Haskell, in an equivalent expression like &lt;code class=&quot;docutils literal&quot;&gt;some_func (y + 1) (z + 2)&lt;/code&gt;, due to &lt;a class=&quot;reference external&quot; href=&quot;https://wiki.haskell.org/Non-strict_semantics&quot;&gt;non-strict semantics&lt;/a&gt; you are not specifying an order of evaluation at all. This enables some clever tricks, like referring to values you haven’t defined yet (see &lt;a class=&quot;reference external&quot; href=&quot;https://wiki.haskell.org/Tying_the_Knot&quot;&gt;Tying the knot&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class=&quot;reference external&quot; href=&quot;https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/&quot;&gt;Function colouring for async&lt;/a&gt; is another good example. In languages like JavaScript or Python with an explicit &lt;code class=&quot;docutils literal&quot;&gt;async&lt;/code&gt; keyword, you have to talk about whether code is sync or async.&lt;/p&gt;
&lt;p&gt;In the case of “sync” functions, you do it by omission of the &lt;code class=&quot;docutils literal&quot;&gt;async&lt;/code&gt; keyword, but you are still choosing between the two options, and there is no way to write code that is ambivalent on the subject.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Most languages without &lt;a class=&quot;reference external&quot; href=&quot;https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)&quot;&gt;garbage collection&lt;/a&gt; force you to talk about memory allocation and de-allocation.&lt;/p&gt;
&lt;p&gt;For languages like C, you normally do this fairly explicitly – or implicitly use stack allocation, but you’ve still got to make that choice. In other languages, it can become more hidden, but doesn’t really go away. In Rust, for example, it gets converted into talk about lifetimes or explicit reference counting. Saying “I just don’t care about when the memory for this gets allocated or de-allocated, please deal with it” is not really one of your options.&lt;/p&gt;
&lt;p&gt;Of course, not talking about memory allocation also has a cost. In that case, the language will almost certainly need to put a lot of things on &lt;a class=&quot;reference external&quot; href=&quot;https://en.wikipedia.org/wiki/Memory_management#HEAP&quot;&gt;the heap&lt;/a&gt; and have a runtime garbage collector. However it may also have significant freedom to choose for you – Haskell will often be able to do this using &lt;a class=&quot;reference external&quot; href=&quot;https://wiki.haskell.org/Performance/Strictness&quot;&gt;strictness analysis&lt;/a&gt;, for example.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All modern languages I’m aware of force you to think about “scope”. In many cases, scope is expressed by the physical place in which you put a variable, with some additional syntax if you want something different (like &lt;a class=&quot;reference external&quot; href=&quot;https://docs.python.org/3/reference/simple_stmts.html#the-global-statement&quot;&gt;global&lt;/a&gt; or &lt;a class=&quot;reference external&quot; href=&quot;https://docs.python.org/3/reference/simple_stmts.html#the-nonlocal-statement&quot;&gt;nonlocal&lt;/a&gt; in Python). If you never want to think about scope, you probably have to drop to assembly and live with a single global address space.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Statically typed languages often force you to think about and talk about the type of every variable. This is lessened somewhat by type inference, as the “conversation” involves a more “intelligent” listener who can pick up more things from context, but it’s still there.&lt;/p&gt;
&lt;p&gt;Pure dynamically typed languages still allow you to talk about types – for example, using things like &lt;code class=&quot;docutils literal&quot;&gt;isinstance&lt;/code&gt; checks in Python, but it is more unnatural (and technically it’s a different thing anyway).&lt;/p&gt;
&lt;p&gt;In contrast to both of them, one of the attractions of &lt;a class=&quot;reference external&quot; href=&quot;https://en.wikipedia.org/wiki/Gradual_typing&quot;&gt;gradually typed languages&lt;/a&gt; is that they genuinely avoid the inverse Sapir-Whorf problem, and allow you the freedom to talk about types, or not, at your preference. I’m not sure how well this works in practice - the existing code base conventions and the linters in use always put pressure in some direction.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I suspect that many of the features of more “approachable” or “readable” programming languages could be analysed in these terms – they have a low inverse Sapir-Whorf barrier, and don’t force you to talk about things you don’t have an opinion on, and may not even understand yet.&lt;/p&gt;
&lt;p&gt;Are there more examples of this that you’ve come across? How do they affect the programming languages we use, or how we perceive them?&lt;/p&gt;


&lt;h2&gt;Links&lt;/h2&gt;
&lt;ul class=&quot;simple&quot;&gt;
&lt;li&gt;&lt;p&gt;&lt;a class=&quot;reference external&quot; href=&quot;https://lobste.rs/s/hb9tdr/inverse_sapir_whorf_programming&quot;&gt;Discussion of this post on Lobsters&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;</description>
	<pubDate>Fri, 01 May 2026 08:40:36 +0000</pubDate>
</item>
<item>
	<title>Tryton News: Tryton News May 2026</title>
	<guid>https://discuss.tryton.org/t/tryton-news-may-2026/9214</guid>
	<link>https://discuss.tryton.org/t/tryton-news-may-2026/9214</link>
	<description>&lt;div&gt;  &lt;/div&gt;
&lt;p&gt;&lt;/p&gt;&lt;div class=&quot;lightbox-wrapper&quot;&gt;&lt;a class=&quot;lightbox&quot; href=&quot;https://discuss-cdn.tryton.org/uploads/default/original/2X/b/b874a91bc0cc86db4d7c0f63c699f944529e0c83.jpeg&quot; title=&quot;Photo: Pexels, Thirdman&quot;&gt;&lt;img src=&quot;https://discuss-cdn.tryton.org/uploads/default/optimized/2X/b/b874a91bc0cc86db4d7c0f63c699f944529e0c83_2_499x500.jpeg&quot; alt=&quot;Persons using a Laptop&quot; title=&quot;Photo: Pexels, Thirdman&quot; width=&quot;499&quot; height=&quot;500&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;p&gt;&lt;/p&gt;

&lt;p&gt;During the last month we focused on fixing bugs, improving behaviour, and speeding up performance, building on the changes from &lt;a href=&quot;https://discuss.tryton.org/t/tryton-release-8-0/&quot;&gt;our last LTS release 8.0&lt;/a&gt;. We also added some new features, which we would like to introduce to you in this newsletter.&lt;/p&gt;
&lt;p&gt;For an in-depth overview of the Tryton issues, please take a look at &lt;a href=&quot;https://bugs.tryton.org/&quot;&gt;our issue tracker&lt;/a&gt; or see the issues and merge requests &lt;a href=&quot;https://code.tryton.org/tryton/-/labels&quot;&gt;filtered by label&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;heading--user&quot;&gt;Changes for the User&lt;/h2&gt;
&lt;h3 id=&quot;heading--accounting&quot;&gt;Accounting, Invoicing and Payments&lt;/h3&gt;
&lt;p&gt;We have now &lt;a href=&quot;https://code.tryton.org/tryton/-/commit/bdc64a6ffe6effe0e4cd1f91a3fb218097f02183&quot;&gt;updated the supported version of Stripe&lt;/a&gt; from &lt;code&gt;2025-09-30.clover&lt;/code&gt; to &lt;code&gt;2026-03-25.dahlia&lt;/code&gt;.&lt;/p&gt;
&lt;h3 id=&quot;heading--stock&quot;&gt;Stock, Production and Shipments&lt;/h3&gt;
&lt;p&gt;Now we &lt;a href=&quot;https://bugs.tryton.org/13304&quot;&gt;include the time-sheet costs in the production work costs&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&quot;heading--ui&quot;&gt;User Interface&lt;/h3&gt;
&lt;p&gt;We now implemented a &lt;a href=&quot;https://bugs.tryton.org/14743&quot;&gt;fallback on the &lt;em&gt;model name&lt;/em&gt;&lt;/a&gt; when no &lt;code&gt;name&lt;/code&gt; parameter is given in a Tryton URL.&lt;/p&gt;
&lt;p&gt;Now we &lt;a href=&quot;https://code.tryton.org/tryton/-/commit/91b7e4ff14ce26582ae8205ae80189aee21ab926&quot;&gt;support sending emails on chat messages&lt;/a&gt; and the ability to reply to them.&lt;/p&gt;
&lt;h2 id=&quot;heading--new-modules&quot;&gt;Modules&lt;/h2&gt;
&lt;p&gt;Now we move the &lt;code&gt;account_de_skr03&lt;/code&gt;, &lt;code&gt;account_es&lt;/code&gt; and &lt;code&gt;account_es_sii&lt;/code&gt; modules to the &lt;a href=&quot;https://foss.heptapod.net/tryton-community/modules&quot;&gt;external tryton-community project&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;heading--new-documentation&quot;&gt;New Documentation&lt;/h2&gt;
&lt;p&gt;We now added new documentation for the &lt;a href=&quot;https://docs.tryton.org/latest/client-rest/index.html&quot;&gt;REST-API&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;heading--new-releases&quot;&gt;New Releases&lt;/h2&gt;
&lt;p&gt;We released bug fixes for the currently maintained &lt;a href=&quot;https://discuss.tryton.org/t/release-process/395&quot;&gt;long term support series&lt;/a&gt;&lt;br /&gt;
&lt;a href=&quot;https://code.tryton.org/tryton/-/commits/branch/8.0&quot;&gt;8.0&lt;/a&gt;, &lt;a href=&quot;https://code.tryton.org/tryton/-/commits/branch/7.0&quot;&gt;7.0&lt;/a&gt; and &lt;a href=&quot;https://code.tryton.org/tryton/-/commits/branch/6.0&quot;&gt;6.0&lt;/a&gt;, and for the penultimate series &lt;a href=&quot;https://code.tryton.org/tryton/-/commits/branch/7.8&quot;&gt;7.8&lt;/a&gt; and &lt;a href=&quot;https://code.tryton.org/tryton/-/commits/branch/7.6&quot;&gt;7.6&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;heading--sysadmin&quot;&gt;Changes for the System Administrator&lt;/h2&gt;
&lt;p&gt;Now we &lt;a href=&quot;https://code.tryton.org/tryton/-/commit/44edc21632c653a7a0db8a0ee42a8631c6d10f31&quot;&gt;add&lt;/a&gt; a &lt;a href=&quot;https://en.wikipedia.org/wiki/Representational_state_transfer&quot;&gt;REST&lt;/a&gt; API for user applications. &lt;a href=&quot;https://docs.tryton.org/latest/client-rest/index.html&quot;&gt;For more information, have a look at its documentation&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&quot;heading--developer&quot;&gt;Changes for Implementers and Developers&lt;/h2&gt;
&lt;p&gt;We now &lt;a href=&quot;https://code.tryton.org/tryton/-/commit/d7e0dcf4ac45824cb6688ddbeb560c519eb37a34&quot;&gt;fall back to &lt;em&gt;compact syntax&lt;/em&gt; if &lt;code&gt;RelaxNG&lt;/code&gt; files are not present&lt;/a&gt;. LXML is able to load the compact syntax in case the &lt;code&gt;rnc2rng&lt;/code&gt; package is installed. This avoids the need to generate the RelaxNG files when developing.&lt;/p&gt;
&lt;p&gt;Authors: &lt;a class=&quot;mention&quot; href=&quot;https://discuss.tryton.org/u/dave&quot;&gt;@dave&lt;/a&gt; &lt;a class=&quot;mention&quot; href=&quot;https://discuss.tryton.org/u/pokoli&quot;&gt;@pokoli&lt;/a&gt; &lt;a class=&quot;mention&quot; href=&quot;https://discuss.tryton.org/u/udono&quot;&gt;@udono&lt;/a&gt;&lt;/p&gt;
            &lt;p&gt;&lt;a href=&quot;https://discuss.tryton.org/t/tryton-news-may-2026/9214&quot;&gt;Read full topic&lt;/a&gt;&lt;/p&gt;</description>
	<pubDate>Fri, 01 May 2026 06:00:35 +0000</pubDate>
</item>
<item>
	<title>Python GUIs: Streamlit Buttons — Making things happen with Streamlit buttons</title>
	<guid>https://www.pythonguis.com/tutorials/streamlit-buttons/</guid>
	<link>https://www.pythonguis.com/tutorials/streamlit-buttons/</link>
	<description>&lt;p&gt;Streamlit is a popular choice for creating interactive web applications in Python. With its simple syntax and intuitive interface, developers can quickly create visually appealing dashboards.&lt;/p&gt;
&lt;p&gt;One of the great things about Streamlit is its ability to easily handle user interaction and dynamically update the UI in response. One of the main ways for users to trigger actions in UIs is through buttons. In Streamlit, the &lt;code&gt;st.button()&lt;/code&gt; function creates a button that users can click to perform an action. Each button can be associated with a different action.&lt;/p&gt;
&lt;p&gt;In this tutorial we'll look at how you can use buttons to add interactivity to your Streamlit apps.&lt;/p&gt;
&lt;h2 id=&quot;creating-buttons-in-streamlit&quot;&gt;Creating Buttons in Streamlit&lt;/h2&gt;
&lt;p&gt;To create a button in Streamlit, you use the &lt;code&gt;st.button()&lt;/code&gt; function, which takes the button's label as its first argument. When the button is clicked, it returns &lt;code&gt;True&lt;/code&gt;, which you can use to control subsequent actions.&lt;/p&gt;
&lt;h3&gt;Basic Button Syntax&lt;/h3&gt;
&lt;p&gt;Here's a simple example of a button in Streamlit:&lt;/p&gt;
&lt;div class=&quot;code-block&quot;&gt;
&lt;span class=&quot;code-block-language code-block-python&quot;&gt;python&lt;/span&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import streamlit as st

if st.button('Click Me'):
    st.write(&quot;Button clicked!&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;&lt;img alt=&quot;Simple Streamlit app with a single button&quot; src=&quot;https://www.pythonguis.com/static/tutorials/streamlit/streamlit-buttons/streamlit-button.png&quot; width=&quot;1372&quot; height=&quot;786&quot; /&gt;
&lt;em&gt;Simple Streamlit app with a single button&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The call &lt;code&gt;st.button('Click Me')&lt;/code&gt; creates a button labeled &lt;em&gt;Click Me&lt;/em&gt;. When the button is clicked, the call returns &lt;code&gt;True&lt;/code&gt;, so the &lt;code&gt;if&lt;/code&gt; condition is satisfied and the nested code runs, displaying the message &quot;Button clicked!&quot;&lt;/p&gt;
&lt;p&gt;This basic structure is the foundation of working with buttons in Streamlit. Through this simple mechanism you can build quite complex interactivity.&lt;/p&gt;
&lt;h2 id=&quot;multiple-buttons-for-different-actions&quot;&gt;Multiple Buttons for Different Actions&lt;/h2&gt;
&lt;p&gt;Building on the basic button structure, you can create multiple buttons within your Streamlit app, each associated with different actions. For instance, let's create buttons that display different messages based on which is clicked.&lt;/p&gt;
&lt;div class=&quot;code-block&quot;&gt;
&lt;span class=&quot;code-block-language code-block-python&quot;&gt;python&lt;/span&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import streamlit as st

if st.button('Show Greeting'):
    st.write(&quot;Hello, welcome to the app!&quot;)

if st.button('Show Goodbye'):
    st.write(&quot;Goodbye! See you soon.&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;&lt;img alt=&quot;Simple Streamlit app with two buttons&quot; src=&quot;https://www.pythonguis.com/static/tutorials/streamlit/streamlit-buttons/streamlit-two-buttons.png&quot; width=&quot;1325&quot; height=&quot;1008&quot; /&gt;
&lt;em&gt;Simple Streamlit app with two buttons&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Each button is wrapped in a conditional statement. When a button is pressed, the corresponding action is executed. Depending on the button pressed, different messages are displayed, providing immediate feedback to the user.&lt;/p&gt;
&lt;p&gt;This structure is versatile and can be expanded to include more buttons and actions.&lt;/p&gt;
&lt;h2 id=&quot;displaying-dynamic-content-based-on-button-clicks&quot;&gt;Displaying Dynamic Content Based on Button Clicks&lt;/h2&gt;
&lt;p&gt;Buttons can be used to display all types of content dynamically, including text, images, and charts. For example, the following app is similar to the previous one but displays images instead of text.&lt;/p&gt;
&lt;div class=&quot;code-block&quot;&gt;
&lt;span class=&quot;code-block-language code-block-python&quot;&gt;python&lt;/span&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import streamlit as st

img_url_1 = &quot;https://placehold.co/150/FF0000&quot;
img_url_2 = &quot;https://placehold.co/150/8ACE00&quot;

if st.button('Show Red Image'):
    st.image(img_url_1, caption=&quot;This is a red image&quot;)

if st.button('Show Green Image'):
    st.image(img_url_2, caption=&quot;This is a green image&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;&lt;img alt=&quot;Simple Streamlit app with two buttons showing images&quot; src=&quot;https://www.pythonguis.com/static/tutorials/streamlit/streamlit-buttons/streamlit-button-images.png&quot; width=&quot;955&quot; height=&quot;819&quot; /&gt;
&lt;em&gt;Simple Streamlit app with two buttons showing images&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;When the &lt;em&gt;Show Red Image&lt;/em&gt; button is pressed, a red image is displayed. The same goes for the &lt;em&gt;Show Green Image&lt;/em&gt; button. This setup allows users to switch between different images based on their preferences.&lt;/p&gt;
&lt;p&gt;Note that state isn't persisted between interactions. When you click &quot;Show Red Image&quot;, the green image disappears, and vice versa. This isn't a &lt;em&gt;toggle&lt;/em&gt; but a natural consequence of how Streamlit works: the whole script is executed again on each interaction, so only one button can be in a &quot;clicked&quot; state at any time.&lt;/p&gt;
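&lt;p&gt;The rerun behaviour can be illustrated without Streamlit at all. The sketch below is a toy model in plain Python, not Streamlit's actual implementation: the hypothetical &lt;code&gt;run_script()&lt;/code&gt; function stands in for one top-to-bottom execution of the app, and its inner &lt;code&gt;button()&lt;/code&gt; helper is an assumption that mimics how a button only reports &lt;code&gt;True&lt;/code&gt; during the run triggered by its own click.&lt;/p&gt;
&lt;div class=&quot;code-block&quot;&gt;
&lt;span class=&quot;code-block-language code-block-python&quot;&gt;python&lt;/span&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;# Toy model of the rerun behaviour (an illustration, not Streamlit's real code).

def run_script(clicked_label=None):
    '''Simulate one top-to-bottom run of the two-button app.

    clicked_label is the label of the button the user just clicked,
    or None for the initial page load.
    '''
    shown = []

    def button(label):
        # Assumption: a button is True only in the run caused by its own click.
        return label == clicked_label

    if button('Show Red Image'):
        shown.append('red image')

    if button('Show Green Image'):
        shown.append('green image')

    return shown


# Every interaction triggers a fresh run; nothing carries over between runs.
print(run_script())                    # initial page load
print(run_script('Show Red Image'))    # user clicks the red button
print(run_script('Show Green Image'))  # user clicks the green button
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;In this model the first run shows nothing, and each later run shows only the image for the button that was just clicked, because the list of displayed items is rebuilt from scratch every time.&lt;/p&gt;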
&lt;p class=&quot;admonition admonition-tip&quot;&gt;&lt;span class=&quot;admonition-kind&quot;&gt;&lt;i class=&quot;fas fa-lightbulb&quot;&gt;&lt;/i&gt;&lt;/span&gt;  To persist state between runs of the script, you can use Streamlit's state management features. We'll cover this in a future tutorial.&lt;/p&gt;
&lt;h2 id=&quot;dynamic-forms-based-on-button-press&quot;&gt;Dynamic Forms Based on Button Press&lt;/h2&gt;
&lt;p&gt;Dynamic forms allow users to provide input in a structured way, which can vary based on user actions. This is particularly useful for collecting information without overwhelming users with multiple fields.&lt;/p&gt;
&lt;p&gt;Here's a quick example where users can input their name and age based on button presses:&lt;/p&gt;
&lt;div class=&quot;code-block&quot;&gt;
&lt;span class=&quot;code-block-language code-block-python&quot;&gt;python&lt;/span&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import streamlit as st

# Title
st.title(&quot;Dynamic Forms Based on Button Press&quot;)

# Name Input Field
if st.button('Enter Name'):
    name = st.text_input('What is your name?')
    if name:
        st.write(f&quot;Hello, {name}!&quot;)

# Age Input Field

if st.button('Enter Age'):
    age = st.number_input('What is your age?', min_value=1, max_value=120)
    if age:
        st.write(f&quot;Your age is {age}.&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
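&lt;p&gt;The &lt;code&gt;Enter Name&lt;/code&gt; button displays a text input field when clicked, allowing users to enter their name. The &lt;code&gt;Enter Age&lt;/code&gt; button displays a number input field for users to enter their age. Keep in mind that, because the script reruns on every interaction, each input field is only shown during the run triggered by its button: once the user changes the value, the script runs again, the button returns &lt;code&gt;False&lt;/code&gt;, and the field disappears. To keep inputs on screen, use Streamlit's state management features or group them in a form.&lt;/p&gt;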
&lt;h3&gt;Handling Form Submission&lt;/h3&gt;
&lt;p&gt;For more complex collections of inputs that you want to work together, consider using &lt;code&gt;st.form()&lt;/code&gt; to group inputs, allowing users to submit all inputs at once:&lt;/p&gt;
&lt;div class=&quot;code-block&quot;&gt;
&lt;span class=&quot;code-block-language code-block-python&quot;&gt;python&lt;/span&gt;
&lt;pre&gt;&lt;code class=&quot;python&quot;&gt;import streamlit as st

# Title
st.title(&quot;Dynamic Forms Based on Button Press&quot;)

with st.form(&quot;my_form&quot;):
    name = st.text_input('What is your name?')
    age = st.number_input('What is your age?', min_value=1, max_value=120)
    submitted = st.form_submit_button(&quot;Submit&quot;)

    if submitted:
        st.write(f&quot;Hello, {name}! Your age is {age}.&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;&lt;img alt=&quot;Streamlit form with submit button&quot; src=&quot;https://www.pythonguis.com/static/tutorials/streamlit/streamlit-buttons/streamlit-form-submit.png&quot; width=&quot;1100&quot; height=&quot;477&quot; /&gt;
&lt;em&gt;Streamlit form with submit button&lt;/em&gt;&lt;/p&gt;
&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;In this tutorial, we explored how to make things happen in Streamlit using buttons. We learned how to create multiple buttons and display dynamic content based on user interaction.&lt;/p&gt;
&lt;p&gt;Now that you have a basic understanding of buttons in Streamlit, you can add basic interaction to your Streamlit applications.&lt;/p&gt;</description>
	<pubDate>Fri, 01 May 2026 06:00:00 +0000</pubDate>
</item>
<item>
	<title>Antonio Cuni: Why Python Is Slow: Talking about SPy on the Behind the Commit Podcast</title>
	<guid>https://antocuni.eu/2026/05/01/why-python-is-slow-talking-about-spy-on-the-behind-the-commit-podcast/</guid>
	<link>https://antocuni.eu/2026/05/01/why-python-is-slow-talking-about-spy-on-the-behind-the-commit-podcast/</link>
	<description>&lt;h1&gt;Why Python Is Slow: Talking about SPy on the Behind the Commit Podcast&lt;/h1&gt;&lt;p&gt;During EuroPython 2025 I had the pleasure of talking to &lt;a href=&quot;https://miabajic.dev/&quot;&gt;Mia Bajić&lt;/a&gt; for her podcast &lt;a href=&quot;https://www.youtube.com/@BehindtheCommit&quot;&gt;Behind The Commit&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;In the chat we mainly talk about Python performance and how SPy tries to improve it.&lt;/p&gt;&lt;p&gt;Now the full episode is live: you can &lt;a href=&quot;https://www.youtube.com/watch?v=CV2tYMPmMWc&amp;t=708s&quot;&gt;watch it on YouTube&lt;/a&gt; or &lt;a href=&quot;https://open.spotify.com/episode/52oMn2JxF9JlwwE0tx1vjF?si=xGGoEcvUQBK2SH2XLV2EOw&quot;&gt;listen on Spotify&lt;/a&gt;.&lt;/p&gt;</description>
	<pubDate>Fri, 01 May 2026 00:00:00 +0000</pubDate>
</item>
<item>
	<title>Real Python: Quiz: Using Python for Data Analysis</title>
	<guid>https://realpython.com/quizzes/python-for-data-analysis/</guid>
	<link>https://realpython.com/quizzes/python-for-data-analysis/</link>
	<description>&lt;p&gt;In this quiz, you&amp;rsquo;ll test your understanding of
&lt;a href=&quot;https://realpython.com/python-for-data-analysis/&quot;&gt;Using Python for Data Analysis&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;By working through this quiz, you&amp;rsquo;ll revisit the stages of a data analysis workflow, including cleansing raw data with pandas, spotting outliers and typos, and using regression to find relationships between variables.&lt;/p&gt;
</description>
	<pubDate>Thu, 30 Apr 2026 12:00:00 +0000</pubDate>
</item>
<item>
	<title>EuroPython: EuroPython 2026: Ticket Sales Now Open</title>
	<guid>https://blog.europython.eu/europython-2026-ticket-sales-now-open/</guid>
	<link>https://blog.europython.eu/europython-2026-ticket-sales-now-open/</link>
	<description>&lt;p&gt;Hey hey, folks &amp;#x1F44B;&lt;/p&gt;&lt;p&gt;Get ready for &lt;strong&gt;EuroPython 2026&lt;/strong&gt;: the conference for all things Python, Data Science, and AI!&amp;#xA0;&lt;/p&gt;&lt;p&gt;We&amp;#x2019;ve got an exciting week planned:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Tutorials (13&amp;#x2013;14 July, Mon&amp;#x2013;Tue)&amp;#xA0;&amp;#x1F6E0;&amp;#xFE0F;&lt;/li&gt;&lt;li&gt;Conference Days (15&amp;#x2013;17 July, Wed&amp;#x2013;Fri) &amp;#x1F3A4;&lt;/li&gt;&lt;li&gt;Sprint Weekend (18&amp;#x2013;19 July, Sat&amp;#x2013;Sun) &amp;#x1F680; &lt;/li&gt;&lt;/ul&gt;&lt;p&gt;We have a special keynote this year: &amp;#x141;ukasz Langa and Pablo Galindo Salgado will be recording the &lt;strong&gt;core.py&lt;/strong&gt; podcast right on the conference stage. It will feature their special guest &lt;strong&gt;Guido van Rossum&lt;/strong&gt;, the creator of Python.&lt;/p&gt;&lt;img src=&quot;https://blog.europython.eu/content/images/2026/04/ticket-sales-1.png&quot; class=&quot;kg-image&quot; alt=&quot;alt&quot; width=&quot;1500&quot; height=&quot;1500&quot; /&gt;&lt;span&gt;Ticket sales for EuroPython 2026 are now open&lt;/span&gt;&lt;p&gt;People who&amp;#x2019;ve been to EuroPython will tell you that it is more than just talks and tutorials: it&amp;#x2019;s a time when the entire community is together, regardless of experience level or background. Each conference leads to new friends being made, projects gaining new contributors, and even people securing their next job. 
We want you all to be a part of it &amp;#x1F49A; &lt;/p&gt;&lt;p&gt;&amp;#x1F3AB; Grab your ticket before they sell out: &lt;/p&gt;&lt;div class=&quot;kg-card kg-button-card kg-align-center&quot;&gt;&lt;a href=&quot;https://ep2026.europython.eu/tickets?ref=blog.europython.eu&quot; class=&quot;kg-btn kg-btn-accent&quot;&gt;Buy a ticket for EuroPython 2026&lt;/a&gt;&lt;/div&gt;&lt;p&gt;Can&amp;#x2019;t wait to see you all in Krak&amp;#xF3;w and hang out with the Python crowd again &amp;#x1F40D;&amp;#x1F49A;&lt;/p&gt;&lt;p&gt;Cheers,&lt;/p&gt;&lt;p&gt;The EuroPython 2026 Organisers &amp;#x2728;&lt;/p&gt;</description>
	<pubDate>Thu, 30 Apr 2026 10:00:30 +0000</pubDate>
</item>
<item>
	<title>Seth Michael Larson: The Frog for Whom the Bell Tolls</title>
	<guid>https://sethmlarson.dev/the-frog-for-whom-the-bell-tolls?utm_campaign=rss</guid>
	<link>https://sethmlarson.dev/the-frog-for-whom-the-bell-tolls?utm_campaign=rss</link>
	<description>&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-9&quot;&gt;
&lt;!-- more --&gt;
&lt;p&gt;&lt;a href=&quot;https://en.wikipedia.org/wiki/Kaeru_no_Tame_ni_Kane_wa_Naru&quot;&gt;Kaeru no Tame ni Kane wa Naru&lt;/a&gt;
(カエルの為(ため)に鐘(かね)は鳴(な)る)
is a Japanese-only Game Boy title published in 1992 by Nintendo
and developed by &lt;a href=&quot;https://en.wikipedia.org/wiki/Intelligent_Systems&quot;&gt;Intelligent Systems&lt;/a&gt;.
The title’s official English translation is “The Frog for Whom the Bell Tolls”.
For brevity, I’ll be using the title “Frog Game” in this article.&lt;/p&gt;
&lt;p&gt;After I &lt;a href=&quot;https://sethmlarson.dev/links-awakening&quot;&gt;finished Link’s Awakening&lt;/a&gt;, the Frog Game started popping up
everywhere in my digital life. The first occurrence was without my knowledge: some of the
characters in Link’s Awakening, &lt;a href=&quot;https://zelda.fandom.com/wiki/Richard_(Link%27s_Awakening)&quot;&gt;Prince Richard and his frogs&lt;/a&gt;,
are originally from the Frog Game and use the same sprites and music.&lt;/p&gt;
&lt;!-- more --&gt;
&lt;/div&gt;
&lt;div class=&quot;col-3&quot;&gt;
&lt;center&gt;
&lt;img src=&quot;https://storage.googleapis.com/sethmlarson-dev-static-assets/FrogGameCartridge.png&quot; /&gt;
&lt;br /&gt;&lt;small&gt;&lt;em&gt;Picture of my “Kaeru no Tame ni Kane wa Naru” Game Boy cartridge.&lt;/em&gt;&lt;/small&gt;
&lt;/center&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;While researching what game to play after Link’s Awakening I watched a
&lt;a href=&quot;https://youtu.be/lbq66dNtJL4?si=kZfOxUODUeJxEwOB&amp;t=245&quot;&gt;video by AntDude&lt;/a&gt;
detailing the history of hand-held Legend
of Zelda games. The video starts by mentioning “Frog Game”
instead of the &lt;em&gt;actual&lt;/em&gt; first Zelda game on the Game Boy: Link’s Awakening.
&lt;em&gt;Very intriguing...&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;After further research I stumbled across a project by &lt;a href=&quot;https://toruzz.com&quot;&gt;Iván Delgado&lt;/a&gt; (&lt;a href=&quot;https://bsky.app/profile/toruzz.com&quot;&gt;Bluesky&lt;/a&gt;, &lt;a href=&quot;https://www.youtube.com/@toruzz&quot;&gt;YouTube&lt;/a&gt;)
to &lt;a href=&quot;https://www.youtube.com/watch?v=xNaSz-vWmDQ&quot;&gt;create a colorization patch&lt;/a&gt; for “Frog Game” that appears to &lt;a href=&quot;https://bsky.app/profile/toruzz.com/post/3m36q7kjmps23&quot;&gt;still be in progress&lt;/a&gt;.
I was already a subscriber to &lt;a href=&quot;https://toruzz.com/blog&quot;&gt;Iván’s blog&lt;/a&gt; and had previously
read their series of posts about &lt;a href=&quot;https://toruzz.com/blog/how-to-colorize-gb-games-first-steps&quot;&gt;colorizing Game Boy games&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Everything I read about the game made me want to play: the game
was affordable, short (&lt;a href=&quot;https://github.com/sethmlarson/retroachievements-play-activity&quot;&gt;7 hours to beat&lt;/a&gt;), with a light and funny narrative, and
had ties to some of my favorite games. I’ve since played Frog Game
and I recommend the game as a quick and fun “pocket-sized”
adventure.&lt;/p&gt;

&lt;h2&gt;Playing with English translations&lt;/h2&gt;

&lt;div class=&quot;row&quot;&gt;
&lt;div class=&quot;col-6&quot;&gt;
&lt;p&gt;Kaeru no Tame ni Kane wa Naru was never released
outside of Japan and despite multiple re-releases
to the 3DS eShop and now &lt;a href=&quot;https://github.com/sethmlarson/nintendo-classics/blob/main/nintendo-classics.csv#L232:~:text=Kaeru%20no%20Tame%20ni%20Kane%20wa%20Naru&quot;&gt;Nintendo Classics&lt;/a&gt;,
there is no official English translation.&lt;/p&gt;
&lt;p&gt;I can’t read
Japanese, but I wanted to experience the dialogue. Luckily for me, there is a fan-created &lt;a href=&quot;https://www.romhacking.net/translations/6517/&quot;&gt;English translation patch&lt;/a&gt; from 2011.
I would need the actual game ROM to apply the patch.&lt;/p&gt;
&lt;/div&gt;
&lt;div class=&quot;col-3&quot;&gt;
&lt;center&gt;
&lt;p&gt;
&lt;img src=&quot;https://storage.googleapis.com/sethmlarson-dev-static-assets/FrogGameJpn.PNG&quot; /&gt;
&lt;br /&gt;&lt;small&gt;&lt;em&gt;Japanese title screen for “Kaeru no Tame ni Kane wa Naru”&lt;/em&gt;&lt;/small&gt;
&lt;/p&gt;
&lt;/center&gt;
&lt;/div&gt;
&lt;div class=&quot;col-3&quot;&gt;
&lt;center&gt;
&lt;p&gt;
&lt;img src=&quot;https://storage.googleapis.com/sethmlarson-dev-static-assets/FrogGameEng.PNG&quot; /&gt;
&lt;br /&gt;&lt;small&gt;&lt;em&gt;Title screen with the English translation patch applied&lt;/em&gt;&lt;/small&gt;
&lt;/p&gt;
&lt;/center&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;I purchased the &lt;a href=&quot;https://www.pricecharting.com/game/jp-gameboy/kaeru-no-tame-ni-kane-wa-naru?q=kaeru+no+tame+ni+kane+wa+naru&quot;&gt;game cartridge for $10 on eBay&lt;/a&gt; and &lt;a href=&quot;https://sethmlarson.dev/backup-game-boy-roms-and-saves-on-ubuntu&quot;&gt;dumped the
ROM&lt;/a&gt; using GB Operator. Next I applied the English translation patch (&lt;code&gt;.ips&lt;/code&gt;)
using &lt;a href=&quot;https://www.marcrobledo.com/RomPatcher.js/&quot;&gt;ROM Patcher JS&lt;/a&gt; by &lt;a href=&quot;https://www.marcrobledo.com/&quot;&gt;Marc Robledo&lt;/a&gt;.
I loaded the resulting ROM into the &lt;a href=&quot;https://sethmlarson.dev/getting-started-with-gamesir-pocket-taco-iphone-delta-emulator&quot;&gt;Delta Emulator&lt;/a&gt; and played exclusively
on this platform (&lt;a href=&quot;https://github.com/sethmlarson/retroachievements-play-activity&quot;&gt;with RetroAchievements enabled&lt;/a&gt;).&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;&lt;strong&gt;Beware: There are minor spoilers beyond this point!&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;References&lt;/h2&gt;

&lt;p&gt;While the game’s title is a reference to “&lt;a href=&quot;https://en.wikipedia.org/wiki/For_Whom_the_Bell_Tolls&quot;&gt;For Whom the Bell Tolls&lt;/a&gt;” by Ernest Hemingway,
the &lt;a href=&quot;https://www.youtube.com/watch?v=MPTMQQLD3FI&quot;&gt;game’s story definitely isn’t&lt;/a&gt;.
One of the goals of the protagonists is to repair and ring the “Spring Bell”
to break the curse on the princes and their army: transforming
them from frogs back into humans.&lt;/p&gt;

&lt;p&gt;The developers of Frog Game, &lt;a href=&quot;https://en.wikipedia.org/wiki/Intelligent_Systems&quot;&gt;Intelligent Systems&lt;/a&gt;, also developed
my favorite game of all time: “&lt;a href=&quot;https://en.wikipedia.org/wiki/Paper_Mario:_The_Thousand-Year_Door&quot;&gt;Paper Mario: The Thousand-Year Door&lt;/a&gt;”.
Chapter 4 of Paper Mario is titled “For Pigs the Bell Tolls” which is another reference to Hemingway &lt;em&gt;and potentially Frog Game?&lt;/em&gt;
Chapter 4’s story in Paper Mario has the villain “Doo*liss” ringing the Creepy Steeple bell
which transforms the Twilight Town inhabitants one-by-one into pigs.&lt;/p&gt;

&lt;p&gt;Frog Game references Nintendo very directly multiple times. During your
adventure you visit “Nantendo Inc.” (&lt;em&gt;not a typo!&lt;/em&gt;) to talk to the scientists there.
One of the “products” you end up needing from Nantendo is a “Mamicon”,
likely a reference to the &lt;a href=&quot;https://en.wikipedia.org/wiki/Nintendo_Entertainment_System&quot;&gt;Nintendo Famicom&lt;/a&gt;. From just the name alone
you will &lt;em&gt;never guess&lt;/em&gt; what the Mamicom does, you’ll have to play the
game to find out!&lt;/p&gt;

&lt;p&gt;Frog Game is referenced in a few other Nintendo games beyond Link’s Awakening,
including an &lt;a href=&quot;https://supersmashbros.fandom.com/wiki/Prince_of_Sabl%C3%A9&quot;&gt;Assist Trophy&lt;/a&gt; and
Single-Player Challenge in Super Smash Bros. Mad Scienstein
from Nantendo Inc. cameos in &lt;a href=&quot;https://www.mariowiki.com/Mad_Scienstein&quot;&gt;Wario Land 3, Wario Land 4, and Dr. Mario 64&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Gameplay&lt;/h2&gt;

&lt;p&gt;The rumors about Link’s Awakening &lt;a href=&quot;https://www.youtube.com/watch?v=iaOhJbSufJI&quot;&gt;sharing an
engine&lt;/a&gt; with Frog Game likely come from using a mix
of top-down and side-scrolling platformer perspectives.
Frog Game uses the top-down perspective when exploring
the world map or different towns and then switches to
side-scrolling when in dungeons or the castle.
Folks who have dug into the
assembly of both games are &lt;a href=&quot;https://toruzz.com/blog/la-kaeru-technical-comparison/&quot;&gt;fairly sure the two games don’t share an engine&lt;/a&gt;,
meaning the rumors are unlikely to be true.
Still a fun story :)&lt;/p&gt;

&lt;p&gt;Despite appearing to be a traditional RPG
with stats like Health, Attack, Speed, and
the ability to upgrade your equipment, this game does not
play like many RPGs. There are no tactics in combat beyond being
able to run away from a battle or use an item, which for
most of the story is only to heal using Wine. Battles
proceed automatically in a cloud of dust and
will consistently resolve as either a victory or defeat.&lt;/p&gt;

&lt;p&gt;Combat and stats are used to limit progression with difficult
“boss enemies” until you’ve discovered or unlocked every
new stat upgrade in an area. Stat upgrades are handed out
like any other item: hidden in chests or as a reward
for defeating an enemy. You can’t increase your
stats on your own using “experience points” or “leveling up”,
meaning the game is in control of how strong you are.&lt;/p&gt;

&lt;p&gt;The “illusion of control” is my favorite design choice of
Frog Game, and it goes beyond just combat and items, too.
There are many points in the game where, without you even
noticing, the game has set you on a “one-way track” where your
combat ability, health, and resources are exactly managed to produce
an outcome later in the story. It’s fun trying to break the
flow and seeing how the game responds!&lt;/p&gt;

&lt;h2&gt;Factions&lt;/h2&gt;

&lt;p&gt;The universe of Frog Game has multiple kingdoms and
three factions: humans, frogs, and snakes.&lt;/p&gt;

&lt;p&gt;Frogs are afraid of snakes, as snakes will actively pursue
frogs as prey, but frogs and humans are either neutral
or friendly towards each other.
The antagonist, Lord Delarin, leads the “Croakian Army”, an
army of soldiers who are friendly towards frogs but hostile
towards humans of other kingdoms and snakes.
Humans, frogs, and snakes can only converse with
members of their group and this “information asymmetry” is used
throughout the story.&lt;/p&gt;

&lt;p&gt;Prince Richard, Prince Sablé, and the Custard Kingdom army
are all “cursed” by Mandola the witch, transforming them into frogs.
Prince Sablé eventually
gains the ability to transform into a frog, snake, and human
somewhat at-will from Mandola through additional “curses”. These
curses end up being instrumental to your success, similar
to the &lt;a href=&quot;https://www.mariowiki.com/Black_chest#:~:text=However%2C%20the%20curses%20benefit%20Mario%2C%20giving%20him%20new%20paper%20abilities&quot;&gt;“curses” from Black Chests in Paper Mario&lt;/a&gt;
or &lt;a href=&quot;https://zeldawiki.wiki/wiki/Li%27l_Devil&quot;&gt;Li’l Devils from Link’s Awakening&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Story&lt;/h2&gt;

&lt;p&gt;The story of Frog Game after the initial few chapters is quite
light. You’re trying to accomplish the
main goal, which is to defeat Delarin and find Princess Tiramisu,
but a lot of that happens in the background. The bulk
of the story is solving your minute-to-minute troubles
caused either by your short-sightedness or the Croakian Army.
You don’t meet Delarin until the very end and, despite
a few late twists, the Princess does not escape
her fate. At the end of the day it’s a Game Boy game, so
the expectations of the story are not high.&lt;/p&gt;
&lt;br /&gt;&lt;hr /&gt;&lt;p&gt;Thanks for keeping RSS alive! ♥&lt;/p&gt;</description>
	<pubDate>Thu, 30 Apr 2026 00:00:00 +0000</pubDate>
</item>
<item>
	<title>PyCharm: Using Bag-of-Words With PyCharm</title>
	<guid>https://blog.jetbrains.com/pycharm/2026/04/using-bag-of-words-with-pycharm/</guid>
	<link>https://blog.jetbrains.com/pycharm/2026/04/using-bag-of-words-with-pycharm/</link>
	<description>&lt;p&gt;Have you ever wondered how &lt;a href=&quot;https://www.jetbrains.com/pycharm/data-science/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;machine learning&lt;/a&gt; models actually work with text? After all, these models require numerical input, but text is, well, text.&lt;/p&gt;



&lt;p&gt;Natural language processing (NLP) offers many ways to bridge this gap, from the large language models (LLMs) that are dominating headlines today all the way back to the foundational techniques of the 1950s. Those early methods fall under what we now call the &lt;strong&gt;bag-of-words (BoW) model&lt;/strong&gt;, and despite their age, they remain remarkably effective for a wide range of language problems.&lt;/p&gt;



&lt;p&gt;In this post, we&amp;#8217;ll unpack how the bag-of-words model works, explore the techniques it uses to convert text into numerical representations, and look at where it fits relative to more modern NLP approaches. We&amp;#8217;ll also build a text classification project using BoW techniques, and see how PyCharm&amp;#8217;s specific features make the whole process faster and easier.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;What is the bag-of-words model?&lt;/h2&gt;



&lt;p&gt;The bag-of-words model is a text representation technique that converts unstructured text into numerical vectors by tracking which words appear across a corpus (a collection of texts). Rather than preserving grammar or word order, it simply represents each document as a &amp;#8220;bag&amp;#8221; of its words, recording how often each one appears. The result is a vector of counts that captures what a text is about, even if it discards how that content is expressed.&lt;/p&gt;



&lt;p&gt;This apparent limitation turns out to matter less than you might expect. For many tasks, such as text classification and sentiment analysis, the presence of certain words is often a stronger signal than their arrangement, and BoW captures that signal efficiently.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;How does bag-of-words work?&lt;/h2&gt;



&lt;p&gt;To use the bag-of-words model, we need to convert each text in a corpus into a numerical vector. Let&amp;#8217;s walk through how that works, starting with what that vector actually looks like.&lt;/p&gt;



&lt;p&gt;Take the following sentence:&lt;/p&gt;



&lt;blockquote class=&quot;wp-block-quote&quot;&gt;
&lt;p&gt;When diving into natural language processing, it is natural for beginners to feel overwhelmed by the complexity of sentiment analysis, which involves distinguishing negative from positive text. However, as you practice with libraries like NLTK or spaCy, the concepts naturally start to click.&lt;/p&gt;
&lt;/blockquote&gt;



&lt;p&gt;A vector representation of this text using the BoW model might look something like this.&lt;/p&gt;



&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&amp;#8230;&lt;/td&gt;&lt;td&gt;natural&lt;/td&gt;&lt;td&gt;naturally&lt;/td&gt;&lt;td&gt;nausea&lt;/td&gt;&lt;td&gt;near&lt;/td&gt;&lt;td&gt;neared&lt;/td&gt;&lt;td&gt;nearing&lt;/td&gt;&lt;td&gt;necessary&lt;/td&gt;&lt;td&gt;negative&lt;/td&gt;&lt;td&gt;&amp;#8230;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;/td&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;0&lt;/td&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;



&lt;p&gt;If we think of this vector as a one-row table, each column represents a word in the vocabulary, and each cell holds a number from 0 to 2. This number is a count of how many times the word occurs in the text, as we can see below:&lt;/p&gt;



&lt;blockquote class=&quot;wp-block-quote&quot;&gt;
&lt;p&gt;When diving into &lt;strong&gt;natural&lt;/strong&gt; language processing, it is &lt;strong&gt;natural&lt;/strong&gt; for beginners to feel overwhelmed by the complexity of sentiment analysis, which involves distinguishing &lt;strong&gt;negative&lt;/strong&gt; from positive text. However, as you practice with libraries like NLTK or spaCy, the concepts &lt;strong&gt;naturally&lt;/strong&gt; start to click.&lt;/p&gt;
&lt;/blockquote&gt;



&lt;p&gt;Here, “natural” appears twice, while “naturally” and “negative” each appear once.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Tokenization&lt;/h3&gt;



&lt;p&gt;Before we can build this vector, we need to split our text into tokens. In BoW modeling, this is typically straightforward: We split on whitespace and treat each punctuation mark as its own token, so &amp;#8220;When diving into natural language processing,&amp;#8221; becomes seven tokens: &lt;code&gt;[&quot;When&quot;, &quot;diving&quot;, &quot;into&quot;, &quot;natural&quot;, &quot;language&quot;, &quot;processing&quot;, &quot;,&quot;]&lt;/code&gt;. This is considerably simpler than the tokenization used in LLMs.&lt;/p&gt;
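


&lt;p&gt;As a minimal sketch of this step, the snippet below reproduces the seven tokens above with a single regular expression. The &lt;code&gt;tokenize&lt;/code&gt; helper is a hypothetical name used for illustration only; real libraries apply their own rules, and scikit-learn&amp;#8217;s &lt;code&gt;CountVectorizer&lt;/code&gt;, for instance, drops punctuation and single-character tokens by default.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;import re


def tokenize(text):
    # Runs of word characters become tokens; every other
    # non-whitespace character (punctuation) is its own token.
    return re.findall(r'\w+|[^\w\s]', text)


tokens = tokenize('When diving into natural language processing,')
print(tokens)       # ['When', 'diving', 'into', 'natural', 'language', 'processing', ',']
print(len(tokens))  # 7&lt;/pre&gt;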



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Vocabulary creation&lt;/h3&gt;



&lt;p&gt;Applying tokenization across every text in the corpus produces a long list of words. Deduplicating this list gives us our vocabulary, which we can see in the set of columns in the vector above. This process does introduce some noise: &amp;#8220;Natural&amp;#8221; and &amp;#8220;natural&amp;#8221;, for instance, would be treated as two separate tokens. We&amp;#8217;ll look at preprocessing steps to address this shortly.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Encoding&lt;/h3&gt;



&lt;p&gt;With a vocabulary in hand, we create a vector for each text with one element per vocabulary word. Encoding is then the process of filling in those elements by checking each vocabulary word against the text.&lt;/p&gt;



&lt;p&gt;The simplest approach is &lt;strong&gt;binary vectorization&lt;/strong&gt;: 0 if a word is absent, 1 if present. More common, however, is &lt;strong&gt;count vectorization&lt;/strong&gt;, which records the actual number of occurrences, as we saw in the example above. Count vectorization carries more information, since it helps distinguish texts that merely mention a topic from those that focus on it heavily.&lt;/p&gt;
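


&lt;p&gt;To make the difference concrete, here is a small sketch in plain Python that builds a vocabulary from a two-sentence toy corpus (made up for this example) and encodes a sentence both ways. The helper names &lt;code&gt;count_vector&lt;/code&gt; and &lt;code&gt;binary_vector&lt;/code&gt; are illustrative, and simple whitespace splitting stands in for a real tokenizer.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;from collections import Counter

# Toy corpus, invented for illustration
corpus = [
    'natural language processing feels natural',
    'negative reviews and positive reviews',
]

# Vocabulary: the sorted set of unique tokens across the whole corpus
vocabulary = sorted({word for text in corpus for word in text.split()})


def count_vector(text):
    # One element per vocabulary word: how many times it occurs in the text
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]


def binary_vector(text):
    # One element per vocabulary word: 1 if present, 0 if absent
    present = set(text.split())
    return [int(word in present) for word in vocabulary]


print(vocabulary)
print(count_vector(corpus[0]))   # [0, 1, 1, 2, 0, 0, 1, 0]
print(binary_vector(corpus[0]))  # [0, 1, 1, 1, 0, 0, 1, 0]&lt;/pre&gt;



&lt;p&gt;The two vectors differ only in the column for &amp;#8220;natural&amp;#8221;, which occurs twice: The count vector records 2, while the binary vector records 1.&lt;/p&gt;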



&lt;p&gt;One practical consequence of this approach is sparsity. If a corpus contains thousands of unique words, each vector will have thousands of elements, but any individual text will only use a small fraction of them, leaving most values at zero. This signal-to-noise issue is something we&amp;#8217;ll return to.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Advantages of the bag-of-words model&lt;/h2&gt;



&lt;p&gt;The bag-of-words model has remained a staple in NLP for good reason. Its greatest strength is its simplicity: Because text is represented as a collection of word counts, the approach is easy to understand and straightforward to implement, making it a natural baseline before reaching for more complex architectures.&lt;/p&gt;



&lt;p&gt;Beyond simplicity, BoW is computationally efficient. As you saw above, the underlying math is lightweight, which means it scales well to large text collections without demanding significant computing resources. For tasks where the presence of specific words is sufficient to capture meaning, with sentiment analysis and topic categorization being the clearest examples, it remains a highly effective tool.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Applications of bag-of-words&lt;/h2&gt;



&lt;p&gt;Like many NLP approaches, the bag-of-words model can be applied to many natural language problems. These potential applications include:&lt;/p&gt;



&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Document classification&lt;/strong&gt;, where encoded documents are sorted into predefined categories. A classic example of this is automatically sorting incoming news articles into distinct categories such as sports, politics, or technology, as we’ll see in the project we do in this post.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Sentiment analysis&lt;/strong&gt;, where the presence of certain words strongly indicates the overall tone of a text, letting models determine whether a piece of writing expresses a positive, negative, or neutral sentiment. If you’re interested in learning more about BoW and other approaches to sentiment analysis, you can see a &lt;a href=&quot;https://blog.jetbrains.com/pycharm/2024/12/introduction-to-sentiment-analysis-in-python/&quot;&gt;prior blog post&lt;/a&gt; I wrote on this topic.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Spam detection&lt;/strong&gt;, which relies heavily on BoW to identify and filter out unwanted emails or messages by learning to recognize the distinct, high-frequency word patterns characteristic of spam.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Retrieval systems&lt;/strong&gt;, where BoW helps to efficiently find the most relevant documents in a large corpus based on a user’s search query.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Topic modeling&lt;/strong&gt;, which aims to group similar text vectors in order to discover and extract the latent topics present within a large collection of documents.&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;As you can see, the range of potential applications is broad, making bag-of-words modeling a popular first approach to natural language problems.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Why use PyCharm for NLP?&lt;/h2&gt;



&lt;p&gt;&lt;a href=&quot;https://www.jetbrains.com/pycharm/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;PyCharm&lt;/a&gt; is particularly well-suited to bag-of-words modeling because it supports the iterative, detail-oriented workflow that text processing requires. As you’ll soon see, building a reliable BoW pipeline involves multiple steps, such as cleaning text, tokenizing, vectorizing, and validating outputs, and PyCharm&amp;#8217;s code intelligence makes each of these smoother. Autocompletion, parameter hints, and quick navigation through specialized NLP libraries reduce friction when experimenting with different vectorizer settings, and help you understand how each component behaves.&lt;/p&gt;



&lt;p&gt;&lt;a href=&quot;https://www.jetbrains.com/pycharm/features/debugger.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;Debugging&lt;/a&gt; and data inspection are equally important here, since small preprocessing mistakes can have an outsized effect on results. PyCharm lets you step through your code and examine intermediate states of things such as token lists and vocabulary at runtime, making it straightforward to verify that your feature extraction is working as intended. This visibility is especially useful when diagnosing issues like unexpected vocabulary sizes or missing terms.&lt;/p&gt;



&lt;p&gt;PyCharm also supports exploratory work through its excellent &lt;a href=&quot;https://www.jetbrains.com/help/pycharm/jupyter-notebook-support.html&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;Jupyter Notebook integration&lt;/a&gt; and scientific tooling. BoW modeling often involves trying different preprocessing strategies and observing their effects immediately, so the ability to run code interactively and inspect outputs inline is a genuine advantage. Combined with built-in virtual environment and package management support, this keeps experiments reproducible and well-organized.&lt;/p&gt;



&lt;p&gt;As projects grow, PyCharm&amp;#8217;s refactoring tools, project navigation, and version control integration help manage the added complexity. BoW models rarely exist in isolation, and they&amp;#8217;re often embedded in larger ML pipelines. In such contexts, PyCharm’s features for working with larger applications mean you spend less time managing code and more time improving your models.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Setting up the project&lt;/h3&gt;



&lt;p&gt;To see these components in action, let&amp;#8217;s build an actual bag-of-words project. We&amp;#8217;ll train a model on a classic text classification dataset, AG News, and then use it to classify news articles into one of four categories: World, Sports, Business, or Science/Technology.&lt;/p&gt;



&lt;p&gt;To get started in PyCharm, open the &lt;em&gt;Projects and Files&lt;/em&gt; tool window and select &lt;em&gt;New… &amp;gt; New Project…&lt;/em&gt;. Since this is a data science project, we can use PyCharm&amp;#8217;s built-in Jupyter project type, which sets up a sensible default structure for us.&lt;/p&gt;



&lt;p&gt;During project configuration, you&amp;#8217;ll be asked to choose a Python interpreter. By default, PyCharm uses uv and lets you select from a range of Python versions, though all major dependency management systems are supported: pip, Anaconda, Pipenv, Poetry, and Hatch. Every project is automatically created with an attached virtual environment, so your setup will be ready to go each time you reopen the project.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-1-selecting-uv-project.png&quot; alt=&quot;&quot; class=&quot;wp-image-703594&quot; /&gt;



&lt;p&gt;With the project configured, we can install our dependencies via the &lt;em&gt;Python Packages&lt;/em&gt; tool window. Simply search for a package by name, select it from the list, and install your desired version directly into the virtual environment. You can also see the same information about the package you&amp;#8217;d find on PyPI directly within the IDE. For this project, we&amp;#8217;ll need pandas and NumPy, along with datasets from Hugging Face, scikit-learn, PyTorch, and spaCy.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-2-installing-package.png&quot; alt=&quot;&quot; class=&quot;wp-image-703605&quot; /&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Implementing bag-of-words with PyCharm&lt;/h2&gt;



&lt;p&gt;There are many versions of this dataset online. We’ll be using &lt;a href=&quot;https://huggingface.co/datasets/sh0416/ag_news&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;one of the versions&lt;/a&gt; hosted on Hugging Face Hub.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Loading and preparing the data&lt;/h3&gt;



&lt;p&gt;We’ll use Hugging Face’s &lt;code&gt;datasets&lt;/code&gt; package to download this dataset.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;from datasets import load_dataset
ag_news_all = load_dataset(&quot;sh0416/ag_news&quot;)&lt;/pre&gt;



&lt;p&gt;This gives us a Hugging Face &lt;code&gt;DatasetDict&lt;/code&gt; object. If we look at it, we can see it contains a training dataset with 120,000 news articles, and a test dataset containing 7,600 articles.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;ag_news_all&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;DatasetDict({
    train: Dataset({
        features: ['label', 'title', 'description'],
        num_rows: 120000
    })
    test: Dataset({
        features: ['label', 'title', 'description'],
        num_rows: 7600
    })
})&lt;/pre&gt;



&lt;p&gt;As we’ll be training a model, we also need a validation set. We’ll convert the training and test sets to pandas DataFrames, and use the &lt;code&gt;train_test_split&lt;/code&gt; function from scikit-learn to create the validation set from the training data.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;import pandas as pd
from sklearn.model_selection import train_test_split

ag_news_train = ag_news_all[&quot;train&quot;].to_pandas()
ag_news_test = ag_news_all[&quot;test&quot;].to_pandas()

ag_news_train, ag_news_val = train_test_split(
   ag_news_train,
   test_size=0.1,     
   random_state=456,   
   stratify=ag_news_train['label'] 
)

print(f&quot;Training set: {len(ag_news_train)} samples&quot;)
print(f&quot;Validation set: {len(ag_news_val)} samples&quot;)&lt;/pre&gt;



&lt;p&gt;We now have a validation set with 12,000 articles, and a training set with 108,000 articles.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;Training set: 108000 samples
Validation set: 12000 samples&lt;/pre&gt;



&lt;p&gt;If you&amp;#8217;re new to machine learning, you might be wondering why we need all of these different datasets. The goal is to get a trustworthy estimate of how well our model will generalize to unseen data. The training set is the only data the model ever learns from directly. The validation set is used to monitor how the model performs on unseen data as we make modeling decisions, such as choosing how many epochs to train for, how large to make the hidden layer, or which preprocessing steps to apply (we’ll see all of this later). Because we look at validation performance repeatedly while building the model, there is a growing risk that our choices gradually become tuned to the quirks of that particular split. This is why we need a third set (the test set), which we keep completely locked away until we&amp;#8217;ve finished all modeling decisions and want a single, unbiased estimate of how well our model will perform on new data. Using the test set for anything other than this final evaluation would give us an overly optimistic picture of our model&amp;#8217;s real-world performance.&lt;/p&gt;



&lt;p&gt;Let’s now inspect our datasets. PyCharm Pro has a lot of built-in features that make working with DataFrames easier, a few of which we’ll see soon. In this DataFrame, we have three columns: the article title, the article description, and the label indicating which of the four news categories the article belongs to. You can open any of the DataFrame cells in the &lt;em&gt;Value Editor&lt;/em&gt; to see its full text, or widen the column to prevent truncation, both of which are useful for a quick visual inspection.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-3-viewing-full-text.png&quot; alt=&quot;&quot; class=&quot;wp-image-703617&quot; /&gt;



&lt;p&gt;At the top of each column, PyCharm displays column statistics, giving you an at-a-glance summary of the data. Switching from &lt;em&gt;Compact&lt;/em&gt; to &lt;em&gt;Detailed&lt;/em&gt; mode via &lt;em&gt;Show Column Statistics&lt;/em&gt; gives you rich summary statistics about each column, and saves you from writing a lot of pandas boilerplate to get it! From these statistics, we can see that our training set is evenly split across the news categories (which is very handy when training a model). We can also see that some headlines and descriptions are not unique, which may introduce noise when classifying these duplicates.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-4-column-statistics.png&quot; alt=&quot;&quot; class=&quot;wp-image-703628&quot; /&gt;



&lt;p&gt;The first step in preparing the data is basic string cleaning, which normalizes the text and reduces meaningless token variation. For instance, without cleaning, &amp;#8220;Natural&amp;#8221; and &amp;#8220;natural&amp;#8221; would be treated as two separate vocabulary entries, as we noted earlier.&lt;/p&gt;



&lt;p&gt;We&amp;#8217;ll apply four cleaning steps: lowercasing, punctuation removal, number removal, and whitespace normalization. There are different string cleaning steps you can apply depending on the language and use case, but for English-language texts, these tend to be very standard. Let’s go ahead and write a function to do this.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;def apply_string_cleaning(dataset: pd.Series) -&gt; pd.Series:
   patterns_to_remove = [
       r&quot;[^a-zA-Z\s]&quot;,
   ]

   cleaned = dataset.str.lower()

   for pattern in patterns_to_remove:
       cleaned = cleaned.str.replace(pattern, &quot; &quot;, regex=True)

   cleaned = cleaned.str.replace(r&quot;\s+&quot;, &quot; &quot;, regex=True).str.strip()

   return cleaned

ag_news_train[&quot;title_clean&quot;] = apply_string_cleaning(ag_news_train[&quot;title&quot;])
ag_news_train[&quot;description_clean&quot;] = apply_string_cleaning(ag_news_train[&quot;description&quot;])&lt;/pre&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-5-raw-and-cleaned-text.png&quot; alt=&quot;&quot; class=&quot;wp-image-703639&quot; /&gt;



&lt;p&gt;This mostly works, but there&amp;#8217;s one issue: The regex strips apostrophes entirely, turning contractions like &amp;#8220;you&amp;#8217;re&amp;#8221; into &amp;#8220;you re&amp;#8221; and possessives like &amp;#8220;Canada’s&amp;#8221; into &amp;#8220;Canada s&amp;#8221;. The cleanest fix is a regex that preserves apostrophes in contractions while removing possessive endings, but this is not the most enjoyable thing to write by hand.&lt;/p&gt;



&lt;p&gt;This is where PyCharm&amp;#8217;s built-in &lt;a href=&quot;https://www.jetbrains.com/pycharm/features/ai/&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;AI Assistant&lt;/a&gt; comes in. Open the chat window via the &lt;em&gt;AI Chat&lt;/em&gt; icon on the right-hand side of the IDE and enter the following prompt:&lt;/p&gt;



&lt;blockquote class=&quot;wp-block-quote&quot;&gt;
&lt;p&gt;Can you please alter the &lt;code&gt;@apply_string_cleaning&lt;/code&gt; function so that it retains apostrophes inside words when they&amp;#8217;re used for contractions (e.g., &amp;#8220;you&amp;#8217;re&amp;#8221;), but removes them when they&amp;#8217;re used for possessives (e.g., &amp;#8220;Canada’s&amp;#8221; into &amp;#8220;Canada&amp;#8221;).&lt;/p&gt;
&lt;/blockquote&gt;



&lt;p&gt;The &lt;code&gt;@&lt;/code&gt; notation lets you reference specific files or objects in your IDE without copying and pasting code into the prompt, including Jupyter variables like datasets and functions.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-6-ai-chat.png&quot; alt=&quot;&quot; class=&quot;wp-image-703650&quot; /&gt;



&lt;p&gt;I ran this against Claude Sonnet 4.5, though JetBrains AI supports a wide range of models from OpenAI, Anthropic, Google, and xAI, as well as open models via Ollama, LM Studio, and OpenAI-compatible APIs. Below is the updated function it returned:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;def apply_string_cleaning(dataset: pd.Series) -&gt; pd.Series:
    cleaned = dataset.str.lower()
    
    # Remove possessive apostrophes (word's -&gt; word)
    # This pattern matches: letter(s) + 's + word boundary
    cleaned = cleaned.str.replace(r&quot;(\w+)'s\b&quot;, r&quot;\1&quot;, regex=True)
    
    # Remove all non-letter characters except apostrophes within words
    cleaned = cleaned.str.replace(r&quot;[^a-zA-Z'\s]&quot;, &quot; &quot;, regex=True)
    
    # Clean up any apostrophes at the start or end of words
    cleaned = cleaned.str.replace(r&quot;\s'|'\s&quot;, &quot; &quot;, regex=True)
    
    # Remove multiple spaces and trim
    cleaned = cleaned.str.replace(r&quot;\s+&quot;, &quot; &quot;, regex=True).str.strip()
    
    return cleaned

ag_news_train[&quot;title_clean&quot;] = apply_string_cleaning(ag_news_train[&quot;title&quot;])
ag_news_train[&quot;description_clean&quot;] = apply_string_cleaning(ag_news_train[&quot;description&quot;])
&lt;/pre&gt;



&lt;p&gt;We can insert this into our Jupyter notebook directly by clicking on &lt;em&gt;Insert Snippet as Jupyter Cell&lt;/em&gt; in the AI chat.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-7-insert-code-as-cell.png&quot; alt=&quot;&quot; class=&quot;wp-image-703664&quot; /&gt;



&lt;p&gt;Once we run this updated function on our raw text, we get the correct result:&lt;/p&gt;



&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;text&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;&lt;strong&gt;text_clean&lt;/strong&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Don’t stand for racism &amp;#8211; football chief&lt;/td&gt;&lt;td&gt;don&amp;#8217;t stand for racism football chief&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Canada&amp;#8217;s Barrick Gold acquires nine per cent stake in Celtic Resources (Canadian Press)&lt;/td&gt;&lt;td&gt;canada barrick gold acquires nine per cent stake in celtic resources canadian press&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;



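&lt;p&gt;We can see the contraction “don’t” is correctly preserved in the first example, while the possessive ending of “Canada’s” has been removed in the second. One side effect to be aware of is that the possessive pattern cannot tell “Canada’s” apart from contractions such as “it’s”, so those lose their ending too. We now apply the same function to the validation dataset, so that the cleaning is consistent across both splits:&lt;/p&gt;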



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;ag_news_val[&quot;title_clean&quot;] = apply_string_cleaning(ag_news_val[&quot;title&quot;])
ag_news_val[&quot;description_clean&quot;] = apply_string_cleaning(ag_news_val[&quot;description&quot;])&lt;/pre&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Creating the bag-of-words model&lt;/h3&gt;



&lt;p&gt;Now that we have clean text, we need to build our vocabulary and encode it. We&amp;#8217;ll use scikit-learn&amp;#8217;s &lt;code&gt;CountVectorizer&lt;/code&gt; for this:&lt;/p&gt;



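&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;from sklearn.feature_extraction.text import CountVectorizer

# Assumption: &quot;text_clean&quot; is the cleaned title followed by the cleaned description
ag_news_train[&quot;text_clean&quot;] = ag_news_train[&quot;title_clean&quot;] + &quot; &quot; + ag_news_train[&quot;description_clean&quot;]

countVectorizerNews = CountVectorizer()
# Learn the vocabulary from the training text only, then encode it as word counts
countVectorizerNews.fit(ag_news_train[&quot;text_clean&quot;])
ag_news_train_cv = countVectorizerNews.transform(ag_news_train[&quot;text_clean&quot;]).toarray()&lt;/pre&gt;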



&lt;p&gt;The process has two distinct steps. First, &lt;code&gt;.fit()&lt;/code&gt; scans the training data and builds a vocabulary by identifying every unique word and assigning it a fixed index position (for example, &amp;#8220;government&amp;#8221; = column 8,901). The result is a mapping of 59,544 unique words, which you can think of as the column headers for our eventual matrix.&lt;/p&gt;



&lt;p&gt;Second, &lt;code&gt;.transform()&lt;/code&gt; uses that vocabulary to convert each headline into a numerical vector, counting how many times each vocabulary word appears and placing that count at the corresponding index position.&lt;/p&gt;



&lt;p&gt;The reason these are two separate steps is important: When we later process our validation and test data, we&amp;#8217;ll call &lt;code&gt;.transform()&lt;/code&gt; using the vocabulary learned from the training set. This ensures that all three splits share a consistent feature space. If we re-ran &lt;code&gt;.fit()&lt;/code&gt; on the test data, we&amp;#8217;d get a different vocabulary, and the model&amp;#8217;s predictions would be meaningless.&lt;/p&gt;
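


&lt;p&gt;The separation between fitting and transforming can be sketched in plain Python. The &lt;code&gt;TinyCountVectorizer&lt;/code&gt; class below is a hypothetical, heavily simplified stand-in written for this illustration (it is not how scikit-learn implements &lt;code&gt;CountVectorizer&lt;/code&gt;), and the headlines are made up. It shows that a vocabulary learned once from the training data yields vectors of the same length for any other split, and that words never seen during fitting are simply ignored.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;from collections import Counter


class TinyCountVectorizer:
    '''Simplified, illustrative stand-in for a count vectorizer.'''

    def fit(self, texts):
        # Learn the vocabulary: each unique word gets a fixed column index
        words = sorted({word for text in texts for word in text.split()})
        self.vocabulary_ = {word: index for index, word in enumerate(words)}
        return self

    def transform(self, texts):
        # Encode texts using the vocabulary learned in fit()
        rows = []
        for text in texts:
            row = [0] * len(self.vocabulary_)
            for word, count in Counter(text.split()).items():
                index = self.vocabulary_.get(word)
                if index is not None:  # words unseen during fit are ignored
                    row[index] = count
            rows.append(row)
        return rows


train = ['stocks rally as profits rise', 'team wins cup final']
val = ['profits rise as team wins sponsorship']

vectorizer = TinyCountVectorizer().fit(train)
train_rows = vectorizer.transform(train)
val_rows = vectorizer.transform(val)

print(len(train_rows[0]), len(val_rows[0]))  # same number of columns
print(val_rows[0])&lt;/pre&gt;



&lt;p&gt;Here, &amp;#8220;sponsorship&amp;#8221; only occurs in the validation text, so it has no column and does not change the shape of the validation vector.&lt;/p&gt;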



&lt;p&gt;With the vectorizer fitted and our training data transformed, we can start exploring what we&amp;#8217;ve actually built. Let&amp;#8217;s first take a look at the vocabulary. &lt;code&gt;CountVectorizer&lt;/code&gt; stores it as a dictionary mapping each word to its index position, accessible via &lt;code&gt;vocabulary_&lt;/code&gt;:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;countVectorizerNews.vocabulary_&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;{'fed': 18461,
 'up': 55833,
 'with': 58324,
 'pension': 38929,
 'defaults': 13156,
 'citing': 9475,
 'failure': 18077,
 'of': 36704,
 'two': 54804,
 'big': 5269,
 'airlines': 1139,
 'to': 53531,
 'make': 31397,
 'payments': 38686,
 'their': 52947,
...}&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;len(countVectorizerNews.vocabulary_)&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;59544&lt;/pre&gt;



&lt;p&gt;This confirms that our vocabulary contains 59,544 unique words. Browsing through it, you can start to guess what kinds of terms appear frequently in the different types of news. Country names feature heavily in the “world” news category, terms like “football” and “cricket” in the “sports” news category, terms like “profit” and “losses” in the “business” news category, and company names like “Google” and “Microsoft” in the “science/technology” category.&lt;/p&gt;



&lt;p&gt;Next, let&amp;#8217;s inspect the feature matrix itself. &lt;code&gt;ag_news_train_cv&lt;/code&gt; is a NumPy array with one row per headline and one column per vocabulary word, giving us a matrix of shape (108,000 × 59,544). We can wrap it in a DataFrame to make it easier to inspect in PyCharm&amp;#8217;s DataFrame viewer:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;pd.DataFrame(ag_news_train_cv, columns=countVectorizerNews.get_feature_names_out())&lt;/pre&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-8-sparse-matrix.png&quot; alt=&quot;&quot; class=&quot;wp-image-703675&quot; /&gt;



&lt;p&gt;As expected, the matrix is very sparse. Most values are zero, since any individual headline only contains a small fraction of the full vocabulary. In fact, you might have noticed that the number of columns is more than half the number of rows, which is far from ideal for a feature matrix. We’ll explore how to reduce the dimensionality of the feature space in a later section.&lt;/p&gt;
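


&lt;p&gt;A quick back-of-the-envelope calculation shows why the size of this matrix matters. The sketch below assumes each count is stored as a 64-bit (8-byte) integer, which is, to our knowledge, the default for &lt;code&gt;CountVectorizer&lt;/code&gt;; the variable names are illustrative.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;rows, cols = 108_000, 59_544
bytes_per_value = 8  # assumption: 64-bit integer counts

# Memory needed if every cell, including the zeros, is stored explicitly
dense_bytes = rows * cols * bytes_per_value

print(f'columns per row: {cols / rows:.2f}')      # 0.55
print(f'dense size: {dense_bytes / 1e9:.1f} GB')  # 51.4 GB&lt;/pre&gt;



&lt;p&gt;If that is more memory than your machine has available, one option is to skip &lt;code&gt;.toarray()&lt;/code&gt; and keep the SciPy sparse matrix that &lt;code&gt;.transform()&lt;/code&gt; returns, which only stores the non-zero counts.&lt;/p&gt;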



&lt;p&gt;Note that we also need to apply this vectorization to the validation dataset before moving on to modeling. Importantly, we only call &lt;code&gt;.transform()&lt;/code&gt; on the validation set, because the vectorizer has already been fitted on the training dataset and its vocabulary must stay fixed.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;ag_news_val_cv = countVectorizerNews.transform(ag_news_val[&quot;text_clean&quot;]).toarray()&lt;/pre&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Visualizing the results&lt;/h2&gt;



&lt;p&gt;Before we move on to reducing the dimensionality of our feature space, let&amp;#8217;s explore the distribution of the words in our corpus. This helps us identify the most common and the rarest words, and decide how to process the data further to improve the signal-to-noise ratio.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Word frequency plots&lt;/h3&gt;



&lt;p&gt;We’ll start by creating a DataFrame that aggregates word counts across all headlines and ranks them by frequency:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;import numpy as np

vocab = countVectorizerNews.get_feature_names_out()
counts = np.asarray(ag_news_train_cv.sum(axis=0)).flatten()

pd.DataFrame({
  'vocab': vocab,
  'count': counts,
}).sort_values('count', ascending=False).reset_index(drop=True)&lt;/pre&gt;



&lt;p&gt;First, we retrieve the vocabulary in index order using &lt;code&gt;get_feature_names_out()&lt;/code&gt;, so each word lines up with its corresponding column in the feature matrix. We then sum the matrix column-wise (that is, across all documents) to get the total number of times each word appears in the training set. Finally, we wrap these two arrays into a DataFrame and sort by count, giving us a ranked list of the most frequent terms.&lt;/p&gt;



&lt;p&gt;Once this DataFrame is displayed in PyCharm, we can easily turn it into a visualization without writing a single line of code. By clicking on the &lt;em&gt;Chart View&lt;/em&gt; button in the top left-hand corner of the DataFrame, we can explore a range of ways of visualizing our data. Go to &lt;em&gt;Show Series Settings&lt;/em&gt; in the top right-hand corner, and adjust the parameters to get the count frequencies of the words: we set the &lt;em&gt;X axis&lt;/em&gt; value to “vocab” (and change &lt;em&gt;group and sort&lt;/em&gt; to &lt;em&gt;none&lt;/em&gt;), the &lt;em&gt;Y axis&lt;/em&gt; value to “count”, and the chart type to “Bar”.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-9-chart-view.png&quot; alt=&quot;&quot; class=&quot;wp-image-703686&quot; /&gt;



&lt;p&gt;Hovering over this chart, we can see that it has a very long-tailed distribution, which is typical of vocabulary frequencies (so typical, in fact, that it is described by Zipf’s law). This means that the majority of our words occur very rarely in the text. If we hover over the right-hand side of the chart, it looks like around a third of our vocabulary terms are used only once!&lt;/p&gt;
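
&lt;p&gt;To make the long-tail observation concrete, here is a minimal pure-Python sketch that ranks words by frequency and measures the share of vocabulary terms used exactly once. The toy corpus and the helper name &lt;code&gt;frequency_profile&lt;/code&gt; are illustrative assumptions, not part of the project code.&lt;/p&gt;

```python
from collections import Counter


def frequency_profile(docs):
    """Return (ranked word counts, share of vocabulary terms used exactly once)."""
    counts = Counter(word for doc in docs for word in doc.split())
    ranked = counts.most_common()  # most frequent first
    singletons = sum(1 for _, n in ranked if n == 1)
    return ranked, singletons / len(ranked)


# Toy corpus (hypothetical), already lowercased and stripped of punctuation
toy_docs = [
    "the fed cuts rates",
    "the team wins the cup",
    "the market rallies as the fed holds rates",
]
ranked, singleton_share = frequency_profile(toy_docs)
print(ranked[:3])
print(f"{singleton_share:.0%} of terms appear only once")
```

&lt;p&gt;Even in this tiny example, a single function word tops the ranking while most terms appear only once, which is the same shape we see in the chart.&lt;/p&gt;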



&lt;p&gt;On the other hand, when we hover over the left-hand side of the chart, we can see that it is dominated by very common function words (prepositions, articles, and pronouns), such as “to”, “in”, “the”, and “you”. These words carry little topical meaning and occur in almost every text, so they’re unlikely to be useful for our classification task.&lt;/p&gt;



&lt;p&gt;Let’s have a look at some things we can do to clean up our feature space and help our semantically meaningful words stand out a bit more.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Advanced bag-of-words techniques&lt;/h2&gt;



&lt;p&gt;The basic BoW pipeline we&amp;#8217;ve built so far is a solid foundation, but there are several techniques that can meaningfully improve its quality. This section walks through the most important ones. We’ll only be using a selection of them in our project, but you can investigate which of these seem appropriate when building your own project.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Stop word removal&lt;/h3&gt;



&lt;p&gt;Stop words are extremely common words that appear frequently across all kinds of text but carry little meaningful information. This includes words like &amp;#8220;the&amp;#8221;, &amp;#8220;is&amp;#8221;, &amp;#8220;and&amp;#8221;, and &amp;#8220;of&amp;#8221;, as we saw in the bar chart in the previous section. They inflate vocabulary size without adding signal, so removing them is one of the most straightforward ways to improve your BoW representation. NLTK provides a built-in stop word list for English and many other languages, and spaCy flags stop words through each token&amp;#8217;s &lt;code&gt;is_stop&lt;/code&gt; attribute.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Stemming and lemmatization&lt;/h3&gt;



&lt;p&gt;Another issue you might have noticed in our vocabulary is that semantically equivalent words appear in different inflected forms (singular and plural, for example). Although they should be treated as the same token, each form occupies its own column. We can resolve this through two techniques: stemming and lemmatization. Stemming reduces words to their root form using simple rule-based truncation (e.g. &amp;#8220;running&amp;#8221; → &amp;#8220;run&amp;#8221;), while lemmatization takes a linguistic approach, mapping words to their dictionary base form. Lemmatization is slower but generally produces cleaner results, particularly for irregular word forms.&lt;/p&gt;
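
&lt;p&gt;The difference between the two approaches can be sketched in a few lines of pure Python. Both the suffix rules and the lemma lookup table below are hypothetical toys, far cruder than a real stemmer or lemmatizer. They only illustrate why rule-based truncation misses irregular forms that a dictionary-based approach can handle.&lt;/p&gt;

```python
# Toy rule-based stemmer (hypothetical rules, much cruder than a real stemmer)
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            stem = word[: -len(suffix)]
            # Undo consonant doubling: "runn" -> "run"
            if len(stem) >= 2 and stem[-1] == stem[-2]:
                stem = stem[:-1]
            return stem
    return word


# Toy lemma lookup (hypothetical); real lemmatizers combine a vocabulary
# with part-of-speech information
LEMMAS = {"running": "run", "ran": "run", "mice": "mouse"}


def naive_lemma(word):
    return LEMMAS.get(word, word)


for word in ["running", "ran", "mice"]:
    print(word, naive_stem(word), naive_lemma(word))
```

&lt;p&gt;The stemmer handles the regular form “running” but leaves “ran” and “mice” untouched, whereas the lookup maps all three to their base forms.&lt;/p&gt;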



&lt;h3 class=&quot;wp-block-heading&quot;&gt;TF-IDF&lt;/h3&gt;



&lt;p&gt;Term frequency-inverse document frequency (TF-IDF) is an extension of basic count vectorization that weights each word by how informative it actually is. A word that appears frequently in one document but rarely across the corpus receives a high weight; a word that appears everywhere receives a low one. This neatly addresses one of the core weaknesses of raw count vectors: common but uninformative words can dominate the feature space even after stop-word removal.&lt;/p&gt;
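
&lt;p&gt;As a rough illustration of the weighting, the sketch below computes TF-IDF by hand in pure Python. It assumes the smoothed variant idf(t) = ln((1 + n) / (1 + df(t))) + 1 followed by L2 normalization, which is how scikit-learn documents its default behavior. The toy corpus and the function name &lt;code&gt;tfidf_vectors&lt;/code&gt; are made up for this example.&lt;/p&gt;

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """TF-IDF with smoothed IDF and L2 normalization, computed by hand."""
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term
    df = Counter(word for tokens in tokenized for word in set(tokens))
    idf = {word: math.log((1 + n_docs) / (1 + df[word])) + 1 for word in vocab}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = [tf[word] * idf[word] for word in vocab]
        norm = math.sqrt(sum(w * w for w in weights)) or 1.0
        vectors.append([w / norm for w in weights])
    return vocab, idf, vectors


# Toy corpus (hypothetical)
toy_docs = ["stocks fall on news", "team wins on penalties", "stocks rally on earnings"]
vocab, idf, vectors = tfidf_vectors(toy_docs)
for word in ["on", "stocks", "penalties"]:
    print(word, round(idf[word], 3))
```

&lt;p&gt;The word “on” appears in every document and gets the lowest weight, “stocks” appears in two of three and sits in the middle, and “penalties” appears in only one and gets the highest.&lt;/p&gt;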



&lt;h3 class=&quot;wp-block-heading&quot;&gt;N-grams&lt;/h3&gt;



&lt;p&gt;Standard BoW treats each word independently, which means it misses phrases whose meaning depends on word combinations. A classic example of this is &amp;#8220;machine learning&amp;#8221;, whose meaning is distinct from that of &amp;#8220;machine&amp;#8221; plus &amp;#8220;learning&amp;#8221;. N-grams address this by treating sequences of adjacent words as single tokens, so a bigram model would capture &amp;#8220;machine learning&amp;#8221; as a feature in its own right. The trade-off is a much larger vocabulary, so in practice, bigrams are most commonly used, with trigrams reserved for cases where capturing longer phrases is important.&lt;/p&gt;
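
&lt;p&gt;Generating n-grams is simple enough to sketch directly. The helper &lt;code&gt;ngrams&lt;/code&gt; below is a hypothetical illustration, not the code used in this project.&lt;/p&gt;

```python
def ngrams(tokens, n):
    """Return all runs of n adjacent tokens, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = "machine learning beats manual rules".split()
unigrams_and_bigrams = ngrams(tokens, 1) + ngrams(tokens, 2)
print(unigrams_and_bigrams)
```

&lt;p&gt;Five words yield five unigrams plus four bigrams, which shows how quickly the vocabulary grows. In scikit-learn, the vectorizers expose this behavior through the &lt;code&gt;ngram_range&lt;/code&gt; parameter, for example &lt;code&gt;ngram_range=(1, 2)&lt;/code&gt; for unigrams and bigrams.&lt;/p&gt;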



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Handling out-of-vocabulary words&lt;/h3&gt;



&lt;p&gt;When you apply your fitted vectorizer to new data, any words not present in the training vocabulary are silently ignored by default. For many tasks, this is acceptable, but if your production data is likely to continue introducing new terms that carry meaningful signal, it&amp;#8217;s worth considering alternatives. One common approach is to reserve a special &amp;lt;UNK&amp;gt; token to represent unseen words. This at least preserves the information that something unfamiliar appeared, even though its identity is lost and multiple (perhaps unrelated) words are collapsed onto the same token.&lt;/p&gt;



&lt;p&gt;However, LLMs, with their more flexible approach to tokenization, tend to be a better choice if out-of-vocabulary words will be a major issue for your model once it is in production.&lt;/p&gt;
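
&lt;p&gt;The reserved-token approach described above can be sketched with a hand-rolled vocabulary. Everything here is an assumption for illustration: the function names are made up, and the placeholder is spelled &lt;code&gt;[UNK]&lt;/code&gt; purely for convenience. scikit-learn&amp;#8217;s vectorizers do not do this by default, so in practice you would need to add such a step yourself.&lt;/p&gt;

```python
from collections import Counter

UNK = "[UNK]"  # placeholder spelling is an arbitrary choice for this sketch


def fit_vocabulary(train_docs):
    """Learn a word -> column index mapping from the training text only."""
    words = sorted({word for doc in train_docs for word in doc.split()})
    vocab = {word: idx for idx, word in enumerate(words)}
    vocab[UNK] = len(vocab)  # reserve the last column for unseen words
    return vocab


def transform(doc, vocab):
    """Count known words and route every unseen word to the UNK column."""
    row = [0] * len(vocab)
    for word, n in Counter(doc.split()).items():
        row[vocab.get(word, vocab[UNK])] += n
    return row


vocab = fit_vocabulary(["fed raises rates", "team wins cup"])
print(transform("fed bans crypto futures", vocab))
```

&lt;p&gt;The three unseen words all land in the final column, so the model can tell that unfamiliar terms appeared but not which ones.&lt;/p&gt;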



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Dimensionality reduction&lt;/h3&gt;



&lt;p&gt;Even after stop word removal and other cleaning steps, BoW feature matrices are typically very high-dimensional and sparse. Two widely used techniques can help. Reducing to the top-N most frequent terms is the simplest approach, discarding low-frequency words that are unlikely to generalize well. For a more principled reduction, techniques like principal component analysis (PCA) or latent semantic analysis (LSA) project the feature matrix into a lower-dimensional space, compressing the representation while preserving as much of the meaningful variance as possible.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Feature selection techniques&lt;/h3&gt;



&lt;p&gt;Rather than reducing dimensionality arbitrarily, feature selection methods identify and retain only the features most relevant to your specific task. Chi-squared testing measures the statistical dependence between each term and the target label, making it well-suited to classification tasks. Mutual information takes a similar approach, scoring each feature by how much it reduces uncertainty about the class. Both methods can substantially reduce vocabulary size while preserving model performance.&lt;/p&gt;
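
&lt;p&gt;To show what the chi-squared score measures, here is the textbook statistic for a 2×2 table of one term against one class, in pure Python. The counts are invented for illustration. Library implementations may compute the score differently, so treat this as a sketch of the idea rather than a reproduction of any particular API.&lt;/p&gt;

```python
def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic for a 2x2 term/class table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class without the term
    d: documents outside the class without the term
    """
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for 1,000 documents, 250 of them in the "sports" class
print(chi_squared_2x2(120, 5, 130, 745))   # a term concentrated in sports
print(chi_squared_2x2(60, 180, 190, 570))  # a term spread evenly across classes
```

&lt;p&gt;The evenly spread term scores exactly zero because its presence says nothing about the class, while the concentrated term scores highly. scikit-learn provides &lt;code&gt;chi2&lt;/code&gt; and &lt;code&gt;mutual_info_classif&lt;/code&gt; scorers in &lt;code&gt;sklearn.feature_selection&lt;/code&gt;, which are typically combined with &lt;code&gt;SelectKBest&lt;/code&gt;.&lt;/p&gt;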



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Applying bag-of-words to a real-world problem&lt;/h2&gt;



&lt;p&gt;Let&amp;#8217;s now continue the example we started earlier. We&amp;#8217;re going to take the work we&amp;#8217;ve done on our AG News text classification task and take it to its completion by building a model.&lt;/p&gt;



&lt;p&gt;A common way to build a model on encoded text is to use a neural network, where each word in the vocabulary is treated as an input feature and the categories we want to predict (in our case, the news categories) are the outputs. We&amp;#8217;ll start by building a baseline model that applies only string cleaning and encoding to the text.&lt;/p&gt;



&lt;p&gt;I had originally written this model in Keras, as part of a previous BoW project from a couple of years ago. However, that code is now out of date. In order to update it and adapt it to PyTorch, I asked JetBrains AI to do the following:&lt;/p&gt;



&lt;blockquote class=&quot;wp-block-quote&quot;&gt;
&lt;p&gt;Please update this neural network from Keras to Pytorch, making improvements to make the code as reusable as possible.&lt;/p&gt;
&lt;/blockquote&gt;



&lt;p&gt;This gave us the following successful port of the code:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

class MulticlassClassificationModel(nn.Module):
   def __init__(self, input_size: int, hidden_layer_size: int, num_classes: int = 4):
       super(MulticlassClassificationModel, self).__init__()
       self.fc1 = nn.Linear(input_size, hidden_layer_size)
       self.relu = nn.ReLU()
       self.fc2 = nn.Linear(hidden_layer_size, num_classes)

   def forward(self, x):
       x = self.fc1(x)
       x = self.relu(x)
       x = self.fc2(x)
       return x

def train_text_classification_model(
       train_features: np.ndarray,
       train_labels: np.ndarray,
       validation_features: np.ndarray,
       validation_labels: np.ndarray,
       input_size: int,
       num_epochs: int,
       hidden_layer_size: int,
       num_classes: int = 4,
       batch_size: int = 1920,
       learning_rate: float = 0.001) -&gt; MulticlassClassificationModel:

   # Convert labels to 0-indexed (AG News has labels 1,2,3,4 -&gt; need 0,1,2,3)
   train_labels_indexed = train_labels - 1
   validation_labels_indexed = validation_labels - 1

   # Convert numpy arrays to PyTorch tensors
   X_train = torch.FloatTensor(train_features.copy())
   y_train = torch.LongTensor(train_labels_indexed.copy())
   X_val = torch.FloatTensor(validation_features.copy())
   y_val = torch.LongTensor(validation_labels_indexed.copy())

   # Create datasets and dataloaders
   train_dataset = TensorDataset(X_train, y_train)
   train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

   # Initialize model, loss function, and optimizer
   model = MulticlassClassificationModel(input_size, hidden_layer_size, num_classes)
   criterion = nn.CrossEntropyLoss()
   optimizer = optim.RMSprop(model.parameters(), lr=learning_rate)

   # Training loop
   for epoch in range(num_epochs):
       model.train()
       train_loss = 0.0
       correct_train = 0
       total_train = 0

       for batch_features, batch_labels in train_loader:
           # Forward pass
           outputs = model(batch_features)
           loss = criterion(outputs, batch_labels)

           # Backward pass and optimization
           optimizer.zero_grad()
           loss.backward()
           optimizer.step()

           # Calculate training metrics
           train_loss += loss.item()
           _, predicted = torch.max(outputs, 1)
           correct_train += (predicted == batch_labels).sum().item()
           total_train += batch_labels.size(0)

       # Validation
       model.eval()
       with torch.no_grad():
           val_outputs = model(X_val)
           val_loss = criterion(val_outputs, y_val)
           _, val_predicted = torch.max(val_outputs, 1)
           correct_val = (val_predicted == y_val).sum().item()
           total_val = y_val.size(0)

       # Print epoch metrics
       train_acc = correct_train / total_train
       val_acc = correct_val / total_val
       print(f'Epoch [{epoch+1}/{num_epochs}], '
             f'Train Loss: {train_loss/len(train_loader):.4f}, '
             f'Train Acc: {train_acc:.4f}, '
             f'Val Loss: {val_loss:.4f}, '
             f'Val Acc: {val_acc:.4f}')

   return model

def generate_predictions(model: MulticlassClassificationModel,
                       validation_features: np.ndarray,
                       validation_labels: np.ndarray) -&gt; list:
   model.eval()

   # Convert to tensors
   X_val = torch.FloatTensor(validation_features.copy())

   with torch.no_grad():
       outputs = model(X_val)
       _, predicted = torch.max(outputs, 1)

   # Convert back to 1-indexed labels to match original dataset
   predicted_labels = (predicted.numpy() + 1)

   print(&quot;Confusion Matrix:&quot;)
   print(pd.crosstab(validation_labels, predicted_labels,
                     rownames=['Actual'], colnames=['Predicted']))
   return predicted_labels.tolist()&lt;/pre&gt;



&lt;p&gt;Let’s walk through this code step-by-step to understand how we’re going to train our text classifier.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;The model architecture&lt;/h3&gt;



&lt;p&gt;&lt;code&gt;MulticlassClassificationModel&lt;/code&gt; is a simple two-layer feedforward neural network. It takes a BoW vector as input, with each feature being a vocabulary word, and passes it through two sequential transformations to produce a prediction. The first layer (&lt;code&gt;fc1&lt;/code&gt;) compresses this high-dimensional input down to a smaller intermediate representation, whose size we control via &lt;code&gt;hidden_layer_size&lt;/code&gt;. A ReLU activation is then applied, which sets negative values to zero. This non-linearity allows the model to learn patterns that a simple weighted sum couldn&amp;#8217;t capture. The second layer (&lt;code&gt;fc2&lt;/code&gt;) takes this intermediate representation and maps it down to four output values, one per news category, where the category with the highest value becomes the model&amp;#8217;s prediction.&lt;/p&gt;
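
&lt;p&gt;To see what the forward pass does without any framework, the sketch below runs the same three steps (linear layer, ReLU, linear layer) by hand on a tiny example. The weights, the three-word vocabulary, and the two output classes are all hypothetical. The real model learns its weights during training and has four outputs.&lt;/p&gt;

```python
def linear(x, weights, bias):
    """Dense layer: one weighted sum per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values):
    """Set negative activations to zero."""
    return [max(0.0, v) for v in values]


# Hypothetical weights for a 3-word vocabulary, 2 hidden units, and 2 classes
W1, b1 = [[1.0, -1.0, 0.0], [-1.0, 1.0, 0.5]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]

bow_vector = [2.0, 0.0, 1.0]  # word counts for one document
hidden = relu(linear(bow_vector, W1, b1))
logits = linear(hidden, W2, b2)
predicted_class = max(range(len(logits)), key=logits.__getitem__)
print(hidden, logits, predicted_class)
```

&lt;p&gt;The second hidden unit receives a negative weighted sum, so ReLU zeroes it out and only the first unit contributes to the final scores.&lt;/p&gt;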



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Training and validation&lt;/h3&gt;



&lt;p&gt;&lt;code&gt;train_text_classification_model&lt;/code&gt; handles the full training loop. It starts with a small amount of housekeeping: the AG News labels run from 1 to 4, but &lt;code&gt;nn.CrossEntropyLoss&lt;/code&gt; expects class indices that start at 0, so the labels are shifted down by 1. The features and labels are then converted to PyTorch tensors, and a &lt;code&gt;DataLoader&lt;/code&gt; is created to feed the training data to the model in batches.&lt;/p&gt;



&lt;p&gt;Each epoch, the model processes the training data batch by batch. For each batch, it runs a forward pass to generate predictions, computes the cross-entropy loss against the true labels, runs a backward pass to compute the gradients, and then lets the RMSprop optimizer update the model weights. At the end of every epoch, the model switches into evaluation mode and runs inference over the full validation set, printing the training and validation loss and accuracy so we can monitor how training is progressing.&lt;/p&gt;
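
&lt;p&gt;The cross-entropy loss itself is easy to compute by hand for a single example: apply a softmax to the raw scores and take the negative log of the probability assigned to the true class. The sketch below uses made-up scores for the four categories. &lt;code&gt;nn.CrossEntropyLoss&lt;/code&gt; performs the equivalent calculation on whole batches and, by default, averages the result.&lt;/p&gt;

```python
import math


def cross_entropy(logits, target_index):
    """Softmax followed by negative log-likelihood for one example."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -math.log(probs[target_index]), probs


# Hypothetical raw scores for the four news categories (0-indexed)
scores = [2.0, 0.5, 0.1, -1.0]
loss_right, probs = cross_entropy(scores, target_index=0)
loss_wrong, _ = cross_entropy(scores, target_index=3)
print(f"true class 0: {loss_right:.3f}, true class 3: {loss_wrong:.3f}")
```

&lt;p&gt;The loss is small when the highest score belongs to the true class and large when it does not, which is the signal the optimizer uses to adjust the weights.&lt;/p&gt;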



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Generating predictions&lt;/h3&gt;



&lt;p&gt;Once training is complete, &lt;code&gt;generate_predictions&lt;/code&gt; runs the trained model on a held-out dataset and returns the predicted class for each article. It also prints a confusion matrix, which gives us a breakdown of which categories the model is getting right and where it&amp;#8217;s getting confused, which is a much more informative picture than accuracy alone.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Running the baseline&lt;/h3&gt;



&lt;p&gt;We can now train the baseline model. We pass in the raw count-vectorized training and validation features, specify an input size equal to the vocabulary size (59,544 columns), train for two epochs, and use a hidden layer of 5,000 nodes.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;baseline_model = train_text_classification_model(
    ag_news_train_cv,
    ag_news_train[&quot;label&quot;].to_numpy(),
    ag_news_val_cv,
    ag_news_val[&quot;label&quot;].to_numpy(),
    ag_news_train_cv.shape[1],
    2,
    5000
)

predictions = generate_predictions(
    baseline_model,
    ag_news_val_cv,
    ag_news_val[&quot;label&quot;].to_numpy()
)&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;Epoch [1/2], Train Loss: 0.3553, Train Acc: 0.8813, Val Loss: 0.2307, Val Acc: 0.9243
Epoch [2/2], Train Loss: 0.1217, Train Acc: 0.9587, Val Loss: 0.2352, Val Acc: 0.9240

Confusion Matrix:
Predicted     1     2     3     4
Actual                           
1          2774    65    89    72
2            37  2944     9    10
3           112    20  2694   174
4            97    20   207  2676&lt;/pre&gt;



&lt;p&gt;Even with the very basic data preparation we did, we can see we’ve performed very well on this prediction task, with around 92% accuracy. The confusion matrix shows that the model seems to have the easiest time distinguishing between category two (sports) and the other topics, and the hardest time distinguishing between category three (business) and category four (science/technology). This makes sense, as the words used to describe sports are very distinct and unlikely to be used in other contexts (things like football), whereas there is likely to be overlapping vocabulary between business and technology (especially company names).&lt;/p&gt;
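
&lt;p&gt;The headline accuracy and the per-category picture can both be recomputed directly from the confusion matrix printed above. The following pure-Python sketch copies those counts and derives overall accuracy plus per-category recall (the share of each actual category that was predicted correctly).&lt;/p&gt;

```python
# Baseline validation confusion matrix copied from the output above
# (rows = actual category 1-4, columns = predicted category 1-4)
confusion = [
    [2774, 65, 89, 72],
    [37, 2944, 9, 10],
    [112, 20, 2694, 174],
    [97, 20, 207, 2676],
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(confusion)))
accuracy = correct / total

# Recall per category: correct predictions divided by the row total
recall = [row[i] / sum(row) for i, row in enumerate(confusion)]

print(f"accuracy = {accuracy:.4f}")
print([round(r, 3) for r in recall])
```

&lt;p&gt;Sports (category two) has a recall of about 98%, while business and science/technology both sit at around 89–90%, which matches the pattern described above.&lt;/p&gt;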



&lt;p&gt;As we saw above, there is a lot we can do to improve the signal-to-noise ratio in BoW modeling. Let’s apply four commonly used techniques to our data and see whether this improves our predictions: lemmatization, stop word removal, limiting our vocabulary to the top N terms, and TF-IDF weighting. As you’ll see, all of these can be done relatively simply using inbuilt functions in packages such as spaCy and scikit-learn.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Lemmatization&lt;/h3&gt;



&lt;p&gt;As we discussed earlier, lemmatization collapses inflected word forms into a single vocabulary entry by mapping each word to its dictionary base form, which both shrinks the vocabulary and concentrates the signal for each concept into a single feature. We&amp;#8217;ll use spaCy for this, which first requires downloading its small English language model:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;!python -m spacy download en_core_web_sm

import spacy

nlp = spacy.load(&quot;en_core_web_sm&quot;)&lt;/pre&gt;



&lt;p&gt;Our &lt;code&gt;lemmatise_text&lt;/code&gt; function passes each text through spaCy&amp;#8217;s NLP pipeline using &lt;code&gt;nlp.pipe()&lt;/code&gt;, which processes them in batches of 1,000 for efficiency. For each document, it extracts the &lt;code&gt;.lemma_&lt;/code&gt; attribute of every token and joins them back into a single string. One small detail worth noting: we preserve the original DataFrame index when constructing the output Series, so that rows stay correctly aligned when we assign the results back.&lt;/p&gt;
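
&lt;p&gt;The function body is not reproduced here, so the following is a minimal sketch reconstructed from the description above. It assumes &lt;code&gt;nlp&lt;/code&gt; is the spaCy pipeline loaded in the previous step and that missing values should be treated as empty strings. The original implementation may differ in its details.&lt;/p&gt;

```python
import pandas as pd


def lemmatise_text(texts: pd.Series) -> pd.Series:
    # Assumes `nlp` is the spaCy pipeline loaded earlier (en_core_web_sm)
    texts = texts.fillna("").astype(str)

    lemmatised_texts = []
    for doc in nlp.pipe(texts, batch_size=1000):
        # Replace every token with its dictionary base form
        lemmatised_texts.append(" ".join(token.lemma_ for token in doc))

    # Keep the original index so rows stay aligned when assigned back
    return pd.Series(lemmatised_texts, index=texts.index)
```

&lt;p&gt;Passing &lt;code&gt;index=texts.index&lt;/code&gt; is what keeps the output aligned with the source DataFrame, even if its index is not a simple 0-to-N range after the train/validation split.&lt;/p&gt;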



&lt;p&gt;We apply lemmatization before string cleaning, since spaCy uses the original casing and punctuation as cues when identifying grammatical structure. Simple cases survive either way (&amp;#8220;running&amp;#8221; and &amp;#8220;Running&amp;#8221; lemmatize to the same thing), but stripping punctuation first can confuse the parser. Once lemmatized, we pass the output through &lt;code&gt;apply_string_cleaning&lt;/code&gt; as before:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;ag_news_train[&quot;title_clean&quot;] = apply_string_cleaning(lemmatise_text(ag_news_train[&quot;title&quot;]))
ag_news_train[&quot;description_clean&quot;] = apply_string_cleaning(lemmatise_text(ag_news_train[&quot;description&quot;]))

ag_news_val[&quot;title_clean&quot;] = apply_string_cleaning(lemmatise_text(ag_news_val[&quot;title&quot;]))
ag_news_val[&quot;description_clean&quot;] = apply_string_cleaning(lemmatise_text(ag_news_val[&quot;description&quot;]))

ag_news_train[&quot;text_clean&quot;] = ag_news_train[&quot;title_clean&quot;] + &quot; &quot; + ag_news_train[&quot;description_clean&quot;]

ag_news_val[&quot;text_clean&quot;] = ag_news_val[&quot;title_clean&quot;] + &quot; &quot; + ag_news_val[&quot;description_clean&quot;]&lt;/pre&gt;



&lt;p&gt;We apply this separately to the title and description columns before concatenating them into a single &lt;code&gt;text_clean&lt;/code&gt; field. As you can see, we do this for both the training and validation sets using the same function, so that lemmatization is applied consistently across both splits.&lt;/p&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-10-lemmatisation.png&quot; alt=&quot;&quot; class=&quot;wp-image-703698&quot; /&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Removing stop words&lt;/h3&gt;



&lt;p&gt;As with lemmatization, we covered the motivation for stop word removal earlier: Words like &amp;#8220;the&amp;#8221;, &amp;#8220;is&amp;#8221;, and &amp;#8220;of&amp;#8221; appear so frequently across all texts that they add noise rather than signal to our feature matrix. Here we&amp;#8217;ll actually apply it to our data.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;def remove_stopwords(texts: pd.Series) -&gt; pd.Series:
   texts = texts.fillna(&quot;&quot;).astype(str)

   filtered_texts = []
   for doc in nlp.pipe(texts, batch_size=1000):
       filtered_texts.append(
           &quot; &quot;.join(token.text for token in doc if not token.is_stop)
       )

   return pd.Series(filtered_texts, index=texts.index)&lt;/pre&gt;



&lt;p&gt;Our &lt;code&gt;remove_stopwords&lt;/code&gt; function again uses &lt;code&gt;nlp.pipe()&lt;/code&gt; to process texts in batches. For each document, it filters out any token where spaCy&amp;#8217;s &lt;code&gt;is_stop&lt;/code&gt; attribute is True, and joins the remaining tokens back into a string. Conveniently, spaCy handles stop word detection using the same pipeline we already loaded for lemmatization, so no additional setup is needed.&lt;/p&gt;



&lt;p&gt;We apply this to the already-cleaned and lemmatized &lt;code&gt;text_clean&lt;/code&gt; column for both the training and validation sets, so the stop word removal builds directly on our previous preprocessing steps and is applied consistently across both splits.&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;ag_news_train[&quot;text_no_stopwords&quot;] = remove_stopwords(ag_news_train[&quot;text_clean&quot;])
ag_news_val[&quot;text_no_stopwords&quot;] = remove_stopwords(ag_news_val[&quot;text_clean&quot;])&lt;/pre&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Top N terms and TF-IDF vectorization&lt;/h3&gt;



&lt;p&gt;The final two improvements we&amp;#8217;ll apply are limiting the vocabulary size and switching from raw count vectorization to TF-IDF weighting. Conveniently, scikit-learn&amp;#8217;s &lt;code&gt;TfidfVectorizer&lt;/code&gt; handles both in a single step.&lt;/p&gt;



&lt;p&gt;Recall from earlier that TF-IDF downweights words that appear frequently across many documents while upweighting words that are distinctive to particular documents. This tones down uninformative words that don’t quite qualify as stop words but add little useful information to our dataset. The &lt;code&gt;max_features=20000&lt;/code&gt; argument caps the vocabulary at the 20,000 terms with the highest total count across the training corpus (the ranking uses raw term frequency, not the TF-IDF scores). This discards the long tail of rare words that are unlikely to generalize well and brings our feature matrix down to a much more manageable size. (The choice of 20,000 words is arbitrary. We could have easily used a smaller or larger number, depending on our dataset and use case.)&lt;/p&gt;



&lt;p&gt;As with &lt;code&gt;CountVectorizer&lt;/code&gt;, we fit only on the training data and then use that fixed vocabulary to transform both the training and validation sets:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;from sklearn.feature_extraction.text import TfidfVectorizer

TfidfVectorizerNews = TfidfVectorizer(max_features=20000)
TfidfVectorizerNews.fit(ag_news_train[&quot;text_no_stopwords&quot;])

ag_news_train_tfidf = TfidfVectorizerNews.transform(ag_news_train[&quot;text_no_stopwords&quot;]).toarray()
ag_news_val_tfidf = TfidfVectorizerNews.transform(ag_news_val[&quot;text_no_stopwords&quot;]).toarray()&lt;/pre&gt;



&lt;p&gt;We can inspect the resulting vocabulary and feature matrix exactly as we did before:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;TfidfVectorizerNews.vocabulary_&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;{'fed': np.int64(6243),
 'pension': np.int64(13134),
 'default': np.int64(4469),
 'cite': np.int64(3200),
 'failure': np.int64(6109),
 'big': np.int64(1787),
 'airline': np.int64(401),
 'payment': np.int64(13051),
 'plan': np.int64(13424),
 'government': np.int64(7306),
 'official': np.int64(12453),
 'tuesday': np.int64(18437),
 'congress': np.int64(3691),
 'hard': np.int64(7689),
 'corporation': np.int64(3901),
...}&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;pd.DataFrame(ag_news_train_tfidf, columns=TfidfVectorizerNews.get_feature_names_out())&lt;/pre&gt;



&lt;img src=&quot;https://blog.jetbrains.com/wp-content/uploads/2026/04/screenshot-12-tf-idf-matrix.png&quot; alt=&quot;&quot; class=&quot;wp-image-703905&quot; /&gt;



&lt;p&gt;Compared to our baseline feature matrix of 59,544 columns filled almost entirely with zeros, this is considerably leaner. We now have 20,000 columns of weighted scores that better reflect each word&amp;#8217;s actual importance to the document it appears in. It is still relatively sparse, but we can see from both the feature matrix and the vocabulary list that it is much more focused on semantically rich words.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Fitting the revised model&lt;/h3&gt;



&lt;p&gt;With our improved features in hand, we can now retrain the model. The call is identical to before, except we pass in the TF-IDF feature matrices instead of the raw count vectors, and the input size is now 20,000 rather than 59,544:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;# Note: this reuses the name baseline_model, overwriting the earlier baseline
baseline_model = train_text_classification_model(
    ag_news_train_tfidf,
    ag_news_train[&quot;label&quot;].to_numpy(),
    ag_news_val_tfidf,
    ag_news_val[&quot;label&quot;].to_numpy(),
    ag_news_train_tfidf.shape[1],
    2,
    5000
)

predictions = generate_predictions(
    baseline_model,
    ag_news_val_tfidf,
    ag_news_val[&quot;label&quot;].to_numpy()
)&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;Epoch [1/2], Train Loss: 0.3183, Train Acc: 0.8932, Val Loss: 0.2301, Val Acc: 0.9225
Epoch [2/2], Train Loss: 0.1512, Train Acc: 0.9475, Val Loss: 0.2332, Val Acc: 0.9243
Confusion Matrix - Raw Counts:
Predicted     1     2     3     4
Actual                           
1          2703    71   121   105
2            20  2955    13    12
3            68    19  2691   222
4            77    17   163  2743&lt;/pre&gt;



&lt;p&gt;The results are actually very encouraging! Our overall validation accuracy is essentially unchanged at around 92%, but we&amp;#8217;ve achieved this with a feature matrix that is roughly a third of the size (20,000 columns instead of 59,544). This suggests that the extra vocabulary in the baseline (including the stop words) was contributing noise rather than signal. A smaller feature matrix also tends to make a model more stable, less prone to overfitting, and much more manageable to deploy.&lt;/p&gt;



&lt;p&gt;Looking at the confusion matrix, the pattern of errors is similar to before: Sports (category two) is the easiest category to classify, with 98.5% of its articles predicted correctly, while Business (category three) and Science/Technology (category four) remain the hardest to separate. About 7% of Business articles are misclassified as Science/Technology, and about 5% of Science/Technology articles are misclassified as Business. This is consistent with what we saw in the baseline, so it seems that the preprocessing improvements have tightened things up at the margins, but the fundamental difficulty of the Business/Technology boundary is a property of the data rather than the feature representation.&lt;/p&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Applying our model to the test set&lt;/h3&gt;



&lt;p&gt;Finally, we need to validate that our model performs as well on the test set as it does on the validation set. Up to this point, we&amp;#8217;ve deliberately kept the test set locked away. As mentioned earlier, if we had been making modeling decisions based on test set performance, we&amp;#8217;d risk inadvertently overfitting our choices to it, and our final accuracy estimate would be optimistic.&lt;/p&gt;



&lt;p&gt;The preprocessing steps must be applied in exactly the same order as for the training and validation data: lemmatization, string cleaning, concatenation of title and description, and stop-word removal. Crucially, we also call &lt;code&gt;.transform()&lt;/code&gt; rather than &lt;code&gt;.fit_transform()&lt;/code&gt; on the test text, using the vocabulary learned from the training data:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;ag_news_test[&quot;title_clean&quot;] = apply_string_cleaning(lemmatise_text(ag_news_test[&quot;title&quot;]))
ag_news_test[&quot;description_clean&quot;] = apply_string_cleaning(lemmatise_text(ag_news_test[&quot;description&quot;]))
ag_news_test[&quot;text_clean&quot;] = ag_news_test[&quot;title_clean&quot;] + &quot; &quot; + ag_news_test[&quot;description_clean&quot;]
ag_news_test[&quot;text_no_stopwords&quot;] = remove_stopwords(ag_news_test[&quot;text_clean&quot;])

ag_news_test_tfidf = TfidfVectorizerNews.transform(ag_news_test[&quot;text_no_stopwords&quot;]).toarray()&lt;/pre&gt;



&lt;p&gt;We can then generate predictions and evaluate accuracy on the test set:&lt;/p&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;from sklearn.metrics import accuracy_score

test_predictions = generate_predictions(
    baseline_model,
    ag_news_test_tfidf,
    ag_news_test[&quot;label&quot;].to_numpy()
)

test_accuracy = accuracy_score(ag_news_test[&quot;label&quot;].to_numpy(), test_predictions)
print(f&quot;Test Accuracy: {test_accuracy:.4f}&quot;)&lt;/pre&gt;



&lt;pre class=&quot;EnlighterJSRAW&quot;&gt;Test Accuracy: 0.9183

Confusion Matrix - Raw Counts:
Predicted     1     2     3     4
Actual                           
1          1710    54    78    58
2            13  1870    10     7
3            51    12  1676   161
4            53     9   115  1723&lt;/pre&gt;



&lt;p&gt;The test accuracy of 91.8% is very close to the 92.4% we saw on the validation set, which is a reassuring sign that our model has generalized well rather than overfitting to the validation data. The confusion matrix tells the same story as before: Sports (category two) remains the easiest category to classify, with only 30 misclassified articles out of 1,900, while the Business/Technology boundary continues to be the main source of errors. About 8% of Business articles are misclassified as Science/Technology, and about 6% of Science/Technology articles are misclassified as Business. The consistency between validation and test results gives us confidence that these error patterns reflect genuine properties of the data rather than artifacts of any particular split.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Limitations and alternatives&lt;/h2&gt;



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Loses word order information&lt;/h3&gt;



&lt;p&gt;The most fundamental limitation of the bag-of-words model is right there in the name: it treats text as an unordered collection of words, discarding all sequence information. This means &amp;#8220;the dog bit the man&amp;#8221; and &amp;#8220;the man bit the dog&amp;#8221; produce identical vectors, even though they describe very different events. For many classification tasks, this doesn&amp;#8217;t matter much, but for tasks that require understanding the relationship between words, such as question answering or natural language inference, the loss of word order is a serious handicap.&lt;/p&gt;
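&lt;p&gt;A minimal sketch of this effect, using only the standard library: the hypothetical &lt;code&gt;bag_of_words&lt;/code&gt; helper below counts whitespace-separated tokens, which is a simplification of the vectorizers used earlier in this post.&lt;/p&gt;

```python
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count how often each lowercase, whitespace-separated token occurs."""
    return Counter(text.lower().split())


first = bag_of_words("the dog bit the man")
second = bag_of_words("the man bit the dog")

# Both sentences contain exactly the same words the same number of times.
print(first == second)  # True
```

&lt;p&gt;Because the counts are all that survive, nothing in the representation records which noun did the biting.&lt;/p&gt;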



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Ignores semantics and context&lt;/h3&gt;



&lt;p&gt;BoW has no notion of word meaning or context. Each word is simply a column in a matrix, entirely independent of every other word. This creates two related problems. First, synonyms are treated as completely distinct features: &amp;#8220;cheap&amp;#8221; and &amp;#8220;inexpensive&amp;#8221; contribute nothing to each other&amp;#8217;s signal, even though they mean the same thing. Second, words with multiple meanings are treated as a single feature regardless of context: &amp;#8220;bank&amp;#8221; is counted in the same column whether it appears in a sentence about rivers or finance. Both of these issues limit how well BoW representations can capture the actual semantics of a text.&lt;/p&gt;
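&lt;p&gt;Both problems can be seen in a toy example. The sketch below assumes a tiny hand-picked vocabulary and two hypothetical helpers, &lt;code&gt;vectorise&lt;/code&gt; and &lt;code&gt;cosine_similarity&lt;/code&gt;, written in plain Python for illustration only.&lt;/p&gt;

```python
import math

# Hypothetical five-word vocabulary, chosen only for this illustration.
vocabulary = ["bank", "cheap", "inexpensive", "river", "loan"]


def vectorise(text: str) -> list[int]:
    """Return one count per vocabulary word, in vocabulary order."""
    tokens = text.lower().split()
    return [tokens.count(word) for word in vocabulary]


def cosine_similarity(a: list[int], b: list[int]) -> float:
    """Cosine of the angle between two count vectors (0.0 means no shared words)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Synonyms occupy different columns, so their vectors have nothing in common.
cheap = vectorise("cheap")
inexpensive = vectorise("inexpensive")
print(cosine_similarity(cheap, inexpensive))  # 0.0

# "bank" lands in the same column in both phrases, whatever it means.
river_sentence = vectorise("river bank")
finance_sentence = vectorise("bank loan")
bank_column = vocabulary.index("bank")
print(river_sentence[bank_column], finance_sentence[bank_column])  # 1 1
```

&lt;p&gt;A similarity of zero between two synonyms, and an identical count for two different senses of one word, are exactly the gaps that the alternatives discussed later in this section try to close.&lt;/p&gt;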



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Can result in large, sparse vectors&lt;/h3&gt;



&lt;p&gt;As we saw in our own example, even a moderately sized corpus of news headlines can produce a vocabulary of nearly 60,000 unique terms. The resulting feature matrix has one column per vocabulary word, but any individual document only uses a tiny fraction of them, leaving the vast majority of values at zero. This sparsity creates two practical problems: The matrices can consume a large amount of memory if stored densely, and the high dimensionality can make it harder for models to find meaningful patterns, a phenomenon sometimes called the curse of dimensionality.&lt;/p&gt;
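&lt;p&gt;A rough back-of-the-envelope calculation shows the scale of the memory problem. The document count below is the 7,600 test articles implied by the confusion matrix; the rounded vocabulary size, the 8-byte values, and the average of 30 distinct terms per article are illustrative assumptions, not measurements.&lt;/p&gt;

```python
# The document count comes from the confusion matrix; the other figures are assumptions.
n_documents = 7_600          # test articles (sum of the confusion matrix)
vocabulary_size = 60_000     # rounded from "nearly 60,000 unique terms"
bytes_per_value = 8          # assumes one float64 per cell
nonzero_per_document = 30    # hypothetical average number of distinct terms per article

# Dense storage keeps every cell, including all the zeros.
dense_bytes = n_documents * vocabulary_size * bytes_per_value

# A simple sparse layout stores one value and one 4-byte column index per nonzero cell
# (row pointers are ignored here to keep the arithmetic short).
sparse_bytes = n_documents * nonzero_per_document * (bytes_per_value + 4)

density = nonzero_per_document / vocabulary_size

print(f"Dense matrix:  {dense_bytes / 1e9:.2f} GB")
print(f"Sparse matrix: {sparse_bytes / 1e6:.2f} MB")
print(f"Share of nonzero cells: {density:.3%}")
```

&lt;p&gt;Under these assumptions, the dense matrix needs gigabytes while the sparse one needs only a few megabytes, which is why keeping the vectorizer output in sparse form or capping the vocabulary matters in practice.&lt;/p&gt;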



&lt;h3 class=&quot;wp-block-heading&quot;&gt;Alternatives&lt;/h3&gt;



&lt;p&gt;If BoW&amp;#8217;s limitations are a bottleneck for your task, there are several well-established alternatives worth considering.&lt;/p&gt;



&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Word embeddings (Word2Vec and GloVe)&lt;/strong&gt; address the semantics problem by representing each word as a dense vector in a continuous space, where similar words are geometrically close to each other. They capture distributional meaning far more richly than BoW, and are a natural next step when synonym handling or word similarity matters. Doc2Vec extends this idea to produce embeddings for entire documents rather than individual words.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Transformer-based models (BERT and GPT)&lt;/strong&gt; go further still, generating contextual representations where the same word receives a different vector depending on the surrounding text. This handles polysemy directly and captures complex long-range dependencies between words. The trade-off is substantially higher computational cost and complexity compared to BoW.&lt;/li&gt;



&lt;li&gt;&lt;strong&gt;Topic models like latent Dirichlet allocation (LDA)&lt;/strong&gt; take a different angle entirely. Rather than encoding documents for downstream classification, they are generative models that discover latent thematic structure in a corpus. This is useful when your goal is exploration and interpretation rather than prediction.&lt;/li&gt;
&lt;/ul&gt;



&lt;p&gt;For tasks where BoW already performs well, as we saw here with AG News, the added complexity of these approaches may not be worth the cost. BoW remains a strong baseline, and it&amp;#8217;s always worth establishing how far it can take you before reaching for heavier machinery.&lt;/p&gt;



&lt;h2 class=&quot;wp-block-heading&quot;&gt;Get started with PyCharm today&lt;/h2&gt;



&lt;p&gt;In this post, we&amp;#8217;ve covered a lot of ground: from the fundamentals of the bag-of-words model and how it converts text into numerical vectors, through to building and iteratively improving a real text classification pipeline on the AG News dataset. Along the way, we&amp;#8217;ve seen how preprocessing steps like lemmatization, stop word removal, vocabulary capping, and TF-IDF weighting can meaningfully improve the efficiency of your feature representation, and how PyCharm&amp;#8217;s DataFrame viewer, column statistics, chart view, and AI Assistant make each of these steps faster and easier to inspect and debug.&lt;/p&gt;



&lt;p&gt;If you&amp;#8217;d like to try this yourself, &lt;a href=&quot;https://www.jetbrains.com/pycharm/download/?section=windows&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;PyCharm Pro&lt;/a&gt; comes with a 30-day trial. As we saw in this tutorial, its built-in support for Jupyter notebooks, virtual environments, and scientific libraries means you can go from a blank project to a working NLP pipeline with minimal setup friction, leaving you free to focus on the fun parts.&lt;/p&gt;



&lt;p&gt;You can find the &lt;a href=&quot;https://github.com/t-redactyl/ag-news-bag-of-words-classification&quot; target=&quot;_blank&quot; rel=&quot;noopener&quot;&gt;full code&lt;/a&gt; for this project on GitHub. If you&amp;#8217;re interested in exploring more NLP topics, check out our recent blog posts &lt;a href=&quot;https://blog.jetbrains.com/pycharm/&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</description>
	<pubDate>Wed, 29 Apr 2026 17:42:41 +0000</pubDate>
</item>
<item>
	<title>PyCon: PyCon US 2026: Call for Volunteers</title>
	<guid>https://pycon.blogspot.com/2026/04/pycon-us-2026-call-for-volunteers.html</guid>
	<link>https://pycon.blogspot.com/2026/04/pycon-us-2026-call-for-volunteers.html</link>
	<description>&lt;p&gt;Looking to make a meaningful contribution to the Python community? Look no further than PyCon US 2026! Whether you're a seasoned Python pro or a newcomer looking to get involved, there's a volunteer opportunity that's perfect for you.&lt;/p&gt;&lt;p&gt;Sign-up for volunteer roles is done directly through the &lt;a href=&quot;http://us.pycon.org/2026/&quot;&gt;PyCon US website&lt;/a&gt;. This way, you can view and manage the shifts you sign up for through &lt;a href=&quot;https://us.pycon.org/2026/accounts/dashboard/&quot;&gt;your personal dashboard&lt;/a&gt;! You can read up on &lt;a href=&quot;https://us.pycon.org/2026/volunteer/volunteering/&quot;&gt;the different roles to volunteer for and how to sign up on the PyCon US website&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;PyCon US is largely organized and run by volunteers. Every year, we ask volunteers to fill over 300 onsite volunteer hours to ensure everything runs smoothly at the event. And the best part? You don't need to commit a lot of time to make a difference: some shifts are as short as 45 minutes! You can sign up for as many or as few shifts as you’d like. Even a couple of hours of your time can go a long way in helping us create an amazing experience for attendees.&lt;/p&gt;&lt;p&gt;Keep in mind that you need to be &lt;a href=&quot;https://us.pycon.org/2026/attend/information/&quot;&gt;registered for the conference&lt;/a&gt; to sign up for a volunteer role.&lt;/p&gt;&lt;div&gt;One important way to get involved is to sign up as a &lt;a href=&quot;http://us/pycon.org/2026/volunteers/volunteering/session-staff/&quot;&gt;Session Chair or Session Runner&lt;/a&gt;. This is an excellent opportunity to meet and interact with speakers while helping to ensure that sessions run smoothly. 
And who knows, you might just learn something new along the way :) If you’re looking for an important yet simple-to-learn role, you may be just the person we’ve been looking for!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If you sign up for these roles, we ask that you do your absolute best to avoid canceling or, worst case, not showing up, so that we can make sure we have coverage for all the necessary time slots. You can sign up for these roles directly on the &lt;a href=&quot;https://us.pycon.org/2026/schedule/talks/&quot;&gt;Talks schedule&lt;/a&gt;: sign up for an open time slot by clicking [+ Volunteer] in one of the talk slots for the session of your choice.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;Volunteer your time at PyCon US 2026 and you’ll be part of a fantastic community that's passionate about Python programming. You can help us make this year's conference a huge success while connecting with your fellow event attendees. It’s especially great for first-timers looking to get the most out of PyCon US. Sign up today for the shifts that call to you and join the fun!&lt;/div&gt;&lt;/div&gt;</description>
	<pubDate>Wed, 29 Apr 2026 14:00:59 +0000</pubDate>
</item>
<item>
	<title>Real Python: AI Coding Agents Guide: A Map of the Four Workflow Types</title>
	<guid>https://realpython.com/ai-coding-agents-guide/</guid>
	<link>https://realpython.com/ai-coding-agents-guide/</link>
	<description>&lt;div&gt;&lt;p&gt;AI coding agents can read your code, reason about changes, and act on your behalf. To choose the right one, it helps to understand the four common workflow types: integrated development environment (IDE), terminal, pull request (PR), and cloud.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In this tutorial, you’ll&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Identify&lt;/strong&gt; the four common &lt;strong&gt;agent interaction modes&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Understand&lt;/strong&gt; what makes &lt;strong&gt;each workflow distinct&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Recognize&lt;/strong&gt; which mode fits &lt;strong&gt;common development scenarios&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Weigh&lt;/strong&gt; the &lt;strong&gt;risks and tradeoffs&lt;/strong&gt; of each workflow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Before exploring the four workflow types, it’s worth looking at what makes a coding tool &lt;a href=&quot;https://realpython.com/ref/ai-coding-glossary/agentic-coding/&quot; class=&quot;ref-link&quot;&gt;agentic&lt;/a&gt; in the first place.&lt;/p&gt;
&lt;div class=&quot;alert alert-warning&quot;&gt;
&lt;p&gt;&lt;strong&gt;Get Your Cheat Sheet:&lt;/strong&gt; &lt;a href=&quot;https://realpython.com/bonus/ai-coding-agents-guide-cheatsheet/&quot; class=&quot;alert-link&quot;&gt;Click here to download your free AI coding agents cheat sheet&lt;/a&gt; and keep the four workflow types at your fingertips when choosing the right agent for the job.&lt;/p&gt;
&lt;/div&gt;
&lt;h2 id=&quot;understanding-ai-coding-agents&quot;&gt;Understanding AI Coding Agents&lt;/h2&gt;
&lt;p&gt;While standard chatbots provide one-off answers, coding agents are designed for autonomy, operating through a continuous execution loop to solve complex tasks. This loop typically follows four distinct steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Read&lt;/strong&gt;: They read relevant files from your codebase to form their context.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reason&lt;/strong&gt;: They determine the logical steps needed to achieve your goal.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Act&lt;/strong&gt;: They execute those steps by editing files, running terminal commands, or using external tools.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Evaluate&lt;/strong&gt;: They check the results of their actions to see if more work is needed.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This loop repeats until the task is completed or the agent hands control back to you. Unlike simple predictive text or one-off prompts, agents bridge the gap between suggestion and execution by autonomously navigating the development workflow.&lt;/p&gt;
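&lt;p&gt;To make the loop concrete, here’s a deliberately tiny sketch of the four steps. It isn’t how any real coding agent is implemented: the “codebase” is a dictionary of strings, the task is to clear &lt;code&gt;TODO&lt;/code&gt; markers, and every function name is hypothetical.&lt;/p&gt;

```python
def read(files: dict[str, str]) -> list[str]:
    """Read: collect the names of files that still contain a TODO marker."""
    return sorted(name for name, text in files.items() if "TODO" in text)


def reason(pending: list[str]) -> str:
    """Reason: pick the next file to work on (here, simply the first one)."""
    return pending[0]


def act(files: dict[str, str], name: str) -> None:
    """Act: edit the chosen file in place."""
    files[name] = files[name].replace("TODO", "done")


def evaluate(files: dict[str, str]) -> bool:
    """Evaluate: report whether the task is finished."""
    return not read(files)


def run_agent(files: dict[str, str], max_steps: int = 10) -> int:
    """Repeat the loop until the task is complete or the step budget runs out."""
    for step in range(1, max_steps + 1):
        pending = read(files)
        if not pending:
            return step - 1
        target = reason(pending)
        act(files, target)
        if evaluate(files):
            return step
    return max_steps


codebase = {"app.py": "x = 1  # TODO", "utils.py": "pass", "tests.py": "# TODO"}
steps_taken = run_agent(codebase)
print(steps_taken, codebase)
```

&lt;p&gt;The &lt;code&gt;max_steps&lt;/code&gt; budget stands in for the point where an agent stops and hands control back to you instead of looping forever.&lt;/p&gt;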
&lt;p&gt;The core agent loop will generally stay the same, but where an agent runs will shape how you interact with it:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;In an editor&lt;/strong&gt;, it works alongside you.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;In a terminal&lt;/strong&gt;, you guide it step by step.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;In pull requests&lt;/strong&gt;, it reviews changes asynchronously.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;In the cloud&lt;/strong&gt;, it works in a managed environment and reports back later.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These environments define four primary agent types, each enabling a distinct workflow: &lt;strong&gt;IDE agents&lt;/strong&gt;, &lt;strong&gt;terminal agents&lt;/strong&gt;, &lt;strong&gt;PR agents&lt;/strong&gt;, and &lt;strong&gt;cloud agents&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id=&quot;exploring-the-four-workflow-types&quot;&gt;Exploring the Four Workflow Types&lt;/h2&gt;
&lt;p&gt;The four workflow types describe interaction modes and don’t always map cleanly to product categories. The same tool often spans multiple workflows. For example, &lt;a href=&quot;https://realpython.com/ref/ai-coding-tools/claude-code/&quot; class=&quot;ref-link&quot;&gt;Claude Code&lt;/a&gt; runs in your &lt;a href=&quot;https://code.claude.com/docs/en/overview#terminal&quot;&gt;terminal&lt;/a&gt;, in your &lt;a href=&quot;https://code.claude.com/docs/en/overview#vs-code&quot;&gt;editor&lt;/a&gt;, and in the cloud with &lt;a href=&quot;https://code.claude.com/docs/en/claude-code-on-the-web&quot;&gt;Claude Code on the web&lt;/a&gt;. It can also review pull requests with &lt;a href=&quot;https://code.claude.com/docs/en/code-review&quot;&gt;Code Review&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The goal is to match the workflow to the task. The diagram below summarizes the four types at a glance:&lt;/p&gt;
&lt;a href=&quot;https://files.realpython.com/media/Autonomous_Agent-2026-04-16-135819_2.0effa4a51d4b.jpeg&quot; target=&quot;_blank&quot;&gt;&lt;img class=&quot;img-fluid mx-auto d-block &quot; src=&quot;https://files.realpython.com/media/Autonomous_Agent-2026-04-16-135819_2.0effa4a51d4b.jpeg&quot; width=&quot;2943&quot; height=&quot;1000&quot; alt=&quot;AI Agent Workflow Type Table&quot; /&gt;&lt;/a&gt;The Four Coding Agent Workflows

&lt;/div&gt;&lt;h2&gt;&lt;a href=&quot;https://realpython.com/ai-coding-agents-guide/?utm_source=realpython&amp;utm_medium=rss&quot;&gt;Read the full article at https://realpython.com/ai-coding-agents-guide/ »&lt;/a&gt;&lt;/h2&gt;
</description>
	<pubDate>Wed, 29 Apr 2026 14:00:00 +0000</pubDate>
</item>
<item>
	<title>Real Python: Quiz: ChatterBot: Build a Chatbot With Python</title>
	<guid>https://realpython.com/quizzes/build-a-chatbot-python-chatterbot/</guid>
	<link>https://realpython.com/quizzes/build-a-chatbot-python-chatterbot/</link>
	<description>&lt;p&gt;In this quiz, you&amp;rsquo;ll test your understanding of
&lt;a href=&quot;https://realpython.com/build-a-chatbot-python-chatterbot/&quot;&gt;ChatterBot: Build a Chatbot With Python&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You&amp;rsquo;ll revisit how ChatterBot learns from conversation data, how it picks replies based on similarity to what it&amp;rsquo;s already seen, and how it can pull in a local LLM to round out its responses.&lt;/p&gt;
</description>
	<pubDate>Wed, 29 Apr 2026 12:00:00 +0000</pubDate>
</item>
<item>
	<title>Real Python: Quiz: Python 3.13: A Modern REPL</title>
	<guid>https://realpython.com/quizzes/python313-repl/</guid>
	<link>https://realpython.com/quizzes/python313-repl/</link>
	<description>&lt;p&gt;Test your knowledge of the redesigned interactive interpreter introduced in
&lt;a href=&quot;https://realpython.com/python313-repl/&quot;&gt;Python 3.13: A Modern REPL&lt;/a&gt;,
including the help system, multiline statement editing, code pasting
improvements, and the history browser.&lt;/p&gt;
&lt;p&gt;Good luck!&lt;/p&gt;
</description>
	<pubDate>Wed, 29 Apr 2026 12:00:00 +0000</pubDate>
</item>

</channel>
</rss>
