Jason L Causey

Hello, I’m Jason. I am a researcher, teacher, programmer, sci-fi nerd, outdoorsy type, and extreme chile pepper enthusiast.

Python PDM First Look

I have been using Poetry to manage my Python virtual environments for a while now, and I’ve (mostly) been OK with it. But the grass is always greener, as they say, and Poetry doesn’t always leave me feeling all warm and fuzzy.

Here is an example that currently makes me grumble at Poetry:

First, let’s try to set up a brand new project that needs Numpy and Tensorflow. Simple, right?

$ mkdir poetry-example
$ cd poetry-example/
$ poetry init -n

[... command output omitted here ...]

$ time poetry add numpy tensorflow
Creating virtualenv poetry-example in /private/tmp/poetry-example/.venv
Using version ^1.22.3 for numpy
Using version ^2.8.0 for tensorflow

Updating dependencies

Resolving dependencies... (0.0s)
Resolving dependencies... (0.1s)

  SolverProblemError

  The current project's Python requirement (>=3.9,<4.0) is not compatible with some of the required packages Python requirement:
    - tensorflow-io-gcs-filesystem requires Python >=3.7, <3.11, so it will not be satisfied for Python >=3.11,<4.0
    - tensorflow-io-gcs-filesystem requires Python >=3.7, <3.11, so it will not be satisfied for Python >=3.11,<4.0
  
  Because no versions of tensorflow-io-gcs-filesystem match >0.23.1,<0.24.0 || >0.24.0
   and tensorflow-io-gcs-filesystem (0.23.1) requires Python >=3.7, <3.11, tensorflow-io-gcs-filesystem is forbidden.
  And because tensorflow-io-gcs-filesystem (0.24.0) requires Python >=3.7, <3.11, tensorflow-io-gcs-filesystem is forbidden.
  Because no versions of tensorflow match >2.8.0,<3.0.0
   and tensorflow (2.8.0) depends on tensorflow-io-gcs-filesystem (>=0.23.1), tensorflow (>=2.8.0,<3.0.0) requires tensorflow-io-gcs-filesystem (>=0.23.1).
  Thus, tensorflow is forbidden.
  So, because poetry-example depends on tensorflow (^2.8.0), version solving failed.

[... backtrace omitted here ...]

  • Check your dependencies Python requirement: The Python requirement can be specified via the `python` or `markers` properties
    
    For tensorflow-io-gcs-filesystem, a possible solution would be to set the `python` property to ">=3.9,<3.11"
    For tensorflow-io-gcs-filesystem, a possible solution would be to set the `python` property to ">=3.9,<3.11"

    https://python-poetry.org/docs/dependency-specification/#python-restricted-dependencies,
    https://python-poetry.org/docs/dependency-specification/#using-environment-markers

real    0m1.286s
user    0m0.907s
sys     0m0.413s

Well, it was fast, but that’s because it failed. This has been an ongoing problem for some of the projects I need to set up in Poetry (often, related to either Tensorflow or Numpy)… In Poetry’s defense, they do give a very good suggestion to fix the issue - and it will fix the issue in this case. But for beginners, this might be enough frustration to simply give up. I want even a first-time user to be able to get the project started without having to fiddle with the config files for the environment manager.
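To see why the solver gave up, and why the suggested fix works, here is a minimal sketch in Python. It assumes the rule is that the project's declared Python range must fit entirely inside each dependency's range, and it models each range as a half-open pair of (major, minor) tuples. The helpers `intersect` and `is_subset` are made up for this illustration; they are not part of Poetry.

```python
# Hypothetical helpers for illustration only; this is not Poetry's implementation.
# Each range is half-open: (inclusive lower bound, exclusive upper bound).

def intersect(a, b):
    """Return the overlap of two version ranges, or None if they do not overlap."""
    lower = max(a[0], b[0])
    upper = min(a[1], b[1])
    return (lower, upper) if lower < upper else None

def is_subset(inner, outer):
    """True if every version in `inner` also falls inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

project = ((3, 9), (4, 0))      # >=3.9,<4.0  (from the error message)
dependency = ((3, 7), (3, 11))  # >=3.7,<3.11 (tensorflow-io-gcs-filesystem)

# The project claims to support 3.11 and later, which the dependency does not.
print(is_subset(project, dependency))    # False

# Narrowing the project to the overlap matches the fix Poetry suggests.
suggested = intersect(project, dependency)
print(suggested)                         # ((3, 9), (3, 11)), i.e. >=3.9,<3.11
print(is_subset(suggested, dependency))  # True
```

In other words, the conflict is not about the Python I actually have installed; it is about the range of Pythons the project says it supports.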

Today I looked at PDM.

“PDM is a modern Python package manager with PEP 582 support. It installs and manages packages in a similar way to npm that doesn’t need to create a virtualenv at all!” [1]

I read about PEP 582, and found that it seems to be stalled in the “Draft” status, but there is also some grass-roots community support. Essentially, it tries to take package management cues from projects like NPM and bring those into the Python ecosystem at the core level. If it passes, it might finally resolve the “environments in Python are the worst” situation. I hope it manages to pass…
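As I understand the draft, the idea is that packages live in a `__pypackages__/X.Y/lib` directory next to your project (where X.Y is the Python version), and the interpreter looks there without any virtual environment being activated. Here is a rough sketch of where that directory would sit; `pypackages_lib` is a hypothetical helper of my own, and `/tmp/pdm-example` is a made-up path.

```python
import sys
from pathlib import Path

def pypackages_lib(project_root, version=None):
    """Return the PEP 582-style package directory for a project (hypothetical helper)."""
    # Default to the running interpreter's major.minor version.
    major, minor = version or sys.version_info[:2]
    return Path(project_root) / "__pypackages__" / f"{major}.{minor}" / "lib"

path = pypackages_lib("/tmp/pdm-example", (3, 9))
print(path.as_posix())  # /tmp/pdm-example/__pypackages__/3.9/lib
```

This only computes the path; it does not show how PDM or any interpreter actually wires that directory into the import system.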

Anyway, here is the same example project, but using PDM to set it up:

$ pdm init -n
Creating a pyproject.toml for PDM...
Using the last selection, add '-i' to ignore it.
Using Python interpreter: ~/.pyenv/versions/3.9.9/bin/python3.9 (3.9)
Changes are written to pyproject.toml.

$ time pdm add numpy tensorflow
Adding packages to default dependencies: numpy, tensorflow
✔ 🔒 Lock successful
Changes are written to pdm.lock.
Changes are written to pyproject.toml.
Synchronizing working set with lock file: 41 to add, 0 to update, 0 to remove

[... omitting several "Install <packagename> successful" lines here ...]

🎉 All complete!

real    2m32.090s
user    0m33.502s
sys     0m5.583s

Well, look at that! It worked. It took a while, but it figured out the dependencies.

I think I’ll try PDM for a few more things. Not only does the solver seem somewhat better (see below), but I like the idea of not needing a virtual environment (in favor of the PEP 582 way of packaging)… We’ll see how it goes after some more projects though.

Nothing is ever that easy…

I tried the same test with a few more packages. Instead of only Tensorflow and Numpy, I tried to add Tensorflow, Numpy, Pandas, matplotlib, and ipykernel to a brand new project (in a single add) with both package managers.

Poetry said:

...

  SolverProblemError

...

Exactly like before. Not surprising.

PDM said:

  [... many "successful" messages and some "failed" ones omitted here ...]
  ✔ Install zipp 3.7.0 successful
  Retry failed jobs
  ⠸ Installing h5py 2.10.0...
  ⠸ Installing tensorflow 2.2.0...
  ✖ Install scipy 1.4.1 failed
Could not find a version that satisfies the requirement tensorflow==2.2.0 (from
  ✖ Install h5py 2.10.0 failed
  ✖ Install tensorflow 2.2.0 failed
  ✖ Install scipy 1.4.1 failed

ERRORS:
[... sad and unhelpful text below omitted ...]

real    14m41.722s
user    6m2.432s
sys     0m54.376s

Well, darn. It started out so well! In this case even PDM got into trouble (I think it was the Pandas in combo with the others that did it somehow). The error messages below the ERRORS: line were also super unhelpful if you are a Python beginner (they are just a crash backtrace through the library’s codebase). And, it took a long time to do it (or… not do it). Boo, PDM. Do better.

So maybe the long Python environment manager nightmare isn’t quite over yet. But I have to say: I like some of the things PDM can do – I’m going to keep my eye on this one and hope it can become a strong recommendation soon.


  1. Intro paragraph at https://pdm.fming.dev/


Quarto Early Impressions

About a month ago I saw a post on Hacker News that led me to check out Quarto. Quarto claims to be “an open-source scientific and technical publishing system built on Pandoc”. I love writing in Markdown for both academic publications (especially during the draft process) and for course content. I write my lecture notes (destined for slides), homework assignments (either PDF or MediaWiki output), and personal notes (destined for nvAlt / Obsidian) in Markdown already. I quickly noticed that some of the same folks who develop R-Markdown are involved with Quarto, and it has backing from R-Studio. I think the R-Markdown project is doing some awesome things, so that really grabbed my attention.

I installed Quarto on January 23, so I’ve been using it for about a month now. Here are some things (good and bad) that I’ve noticed so far…

General Impressions

Quarto tries to be an end-to-end publishing experience. It works with regular Markdown format, the Pandoc flavor of Markdown, and also adds some of its own extensions in places. Mostly, it looks and feels like Pandoc markdown with Citeproc, Fignos, and Tablenos filters active. It can also understand R-Markdown and Jupyter notebooks as input (and output) formats. It can also convert between formats pretty effectively – in this way, it is doing the same thing as the Jupytext project which I have been an avid user of for a couple of years. So, with one install, you get Markdown/R-Markdown/Jupyter to «choose your favorite output format - Quarto supports practically anything», and the input formats are also supported output formats! Nice.

How I’m using Quarto (some are aspirational at the moment)

Creating Exams (pdf)
I use a very custom template for my in-class exams, so I wasn’t sure how well Quarto would adapt to it. I ran into one small issue with the YAML option margin somewhere in the Quarto pipeline clashing with the custom variable I had in my template, but it was easy to update my template to avoid the name collision. The output is now exactly the same as the old way (with raw Pandoc and a lot of flags).
Lecture Notes (pdf)
I use PDF (Beamer) lecture notes for some of my graduate courses, again with a custom template. I was pleasantly surprised to find that it all worked flawlessly just by setting the template in the YAML front matter. Easy! I haven’t used any algorithmic figure output yet, but I have experimented with it in a toy deck, and it seems that it will work fine.
Lecture Notes (HTML-via-Markdown)
I keep many of the slides for my undergraduate courses online as a website hosted on GitHub Pages, and generated by Hugo with Remark.js (see my repo here). Quarto is capable of generating Reveal.js output, but actually my system only requires Hugo-compatible Markdown with all the assets (images, etc) in the right places. I think I can make it work with the Hugo output options Quarto offers, but I haven’t had time to experiment with this yet. I’m really not sure it would save much time/effort, but maybe the citation management and auto-generated figures would be nice… I look forward to trying it out soon.
Academic publications
TBD. I have not had a chance to start any new academic writing using Quarto yet – but I absolutely intend to start my next manuscript this way to see how far I can get.

Pros

Here are a few things that really stand out about Quarto so far:

YAML front matter support for selecting output formats and templates.
This was a big gripe of mine with respect to Pandoc alone (and a feature I love in R-Markdown). I want to be able to specify the output format and template right in the front matter. Yes, I know that personal taste in YAML-in-Markdown varies out there, but for my use case this makes a lot of sense.
Better figure positioning options
This is another big one. Positioning figures and tables in Pandoc/Markdown can be infuriating, let alone if you have multiple output format targets. Quarto brings its own extensions for doing this, and they work pretty well (at least in my short time using them so far). Not just the things you would think would be considered table stakes, like left/center/right justification… I mean, multiple figures in a row? Figures with sub-figures? Finally.
Unified preview, render, convert
Before Quarto, I used some tools that could live preview (Hugo can do this). I used Pandoc for the rendering, and for some type-conversion functions. I used Jupytext for type conversion where Jupyter Notebooks were involved… Quarto can do all of this in one tool. That is super convenient.
Page layout options
I’m not sure how much I would use this, but it is nice that you can choose to position content in the margin, or to make a section that is wider than the rest (maybe for a figure or large table). Before, it was a pain. Quarto makes it easy. (Or, they make it look easy - I haven’t had a chance to make any real use of this yet.)

Cons

No tool is perfect. This one is still new and in active development, so of course there will be some rough edges. Here are a couple that I hope are worked out soon (or eventually, at least):

.qmd extension (sometimes) required
Just what the world needs - another file extension that may be Markdown. I tend to just use .md for all my markdown files - and my editors all understand what that means. Quarto introduces .qmd to denote their own flavor of Markdown. It makes sense to avoid confusion if a plain Markdown converter doesn’t give the desired results because of the extensions - but that is also the case with e.g. Pandoc Markdown extensions, or «choose your Markdown flavor» extensions. We live with it and move on. My complaint is that Quarto requires .qmd on files that contain code that should be executed while rendering the document. Internally, they also use a slightly different syntax for these code blocks, so the renderer should be able to tell the difference anyway – why also require the extension? The extension doesn’t seem to be required for any of the other “magic” to work… Let me have my executable block cake and eat my .md extension too, Quarto – please?
YAML options are sometimes flaky
As much as I love the added functionality Quarto brings to the YAML front matter – I find some of the options are flaky at times and seem to have no effect for reasons I can’t figure out. For example, I have had no luck setting the font size with the fontsize option. I’m not sure if it is something I’m doing wrong, or whether something in Quarto is broken, or maybe it is something in my LaTeX template (and also the HTML output???). I just don’t know. I am quite impressed with how many of these options do work though, across the different output formats. It gives me hope that they will all be solid eventually (or I will figure out that it was my mistake all along).
Markdown extensions are a double-edged sword
Let me reiterate that I love some of the extensions the Quarto team have added to allow things like rational figure positioning options in Markdown. I love it. But, if Quarto ever goes away you are stuck with non-standard Markdown files that no longer render the way you want in vanilla Pandoc. It would be nice if the Quarto team could work with John MacFarlane and the Pandoc community to get some of these features into Pandoc proper. This is a minor nitpick, but I’ve had fantastic tools go away before, and it leaves a bitter taste when you lose features.

Final (early) impression and hopes for the future

Well, I can sum this up simply as: “Try Quarto – if you create a variety of scientific or academic content using Markdown, you’re going to love it!” It isn’t perfect, and there are some bugs here and there, but the team is strong and I am optimistic about this tool. I suppose I’ll have to check back in a year or two to see if I still feel the same. Until then, give Quarto a try. Let’s all work on bringing plain-text manuscript development into the academic mainstream so that we can collaborate with modern tooling like git and stop fighting “styles” and terrible equation editing and reference management in Word documents.

My biggest “wish” for Quarto at the moment would be that they drop the requirement for the .qmd extension for files that contain code blocks intended to be executed. If they are worried about the user being surprised, or wanting the blocks to not execute sometimes - why not add YAML options to turn it on/off? The custom fenced code block syntax already seems to guard against unintended behavior. Sure, you might have to make a one-pass edit to bring old .md files in line, e.g. convert ```{python} to ```{.python} to prevent execution – but that’s a simple find-and-replace-all operation. As an aside: I wonder why they didn’t decide to make the “dotted” version mean “executable” instead – thus removing the most common conflict here? [1]
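For what it’s worth, that one-pass edit could look something like the sketch below. It is only a rough illustration: `make_non_executable` is a name I made up, and the pattern only handles the bare {python} form of an opening fence (no extra options inside the braces).

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the sample text tidy

# Matches an opening fence followed by {lang}, but not {.lang} or a bare lang.
EXECUTABLE_FENCE = re.compile(r"^(`{3,})\{(\w+)\}[ \t]*$", re.MULTILINE)

def make_non_executable(markdown):
    """Rewrite {lang} fences as {.lang} so the block is shown but not run."""
    return EXECUTABLE_FENCE.sub(r"\1{.\2}", markdown)

before = "\n".join([
    "Some text.",
    FENCE + "{python}",
    "print('hello')",
    FENCE,
])
after = make_non_executable(before)
print(after.splitlines()[1] == FENCE + "{.python}")  # True
```

Blocks that already use the dotted form, or a plain language name after the fence, are left alone.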


  1. [edit] I think they view the dotted version as the more common case, so the choice makes sense. I usually just use ```python and ```{python}, but I guess I’m in the minority there.


Dash a

Don’t forget a little dash a.

Typos are too easy. I have to laugh about this one though…

I needed to quickly add a user to a server I administer and make sure they could read/write within the web server’s document root without giving them sudo rights. It says so right there… Simple; create the user newuser and add a new group webadmin real quick to allow that group write permissions in the directory. Put the new user into the webadmin group and Bob’s your uncle. Simple.

Why not add myself to that group too, so that I don’t have to sudo to modify the directory either? We had locked the files in there to remove write permissions earlier, but if I’m setting up group permissions, I might as well take advantage of that as well. Probably should have done that a long time ago… Let me just add myself to the group:

sudo usermod -G webadmin myusername

Done and done.

Did you catch the error? (I didn’t until it was too late.) The correct command was:

sudo usermod -aG webadmin myusername

Yep. I just replaced all my groups (including sudo) with just one: webadmin. Sure, I can write in the document root now, but I’m not really the admin anymore. And I’m working remotely, so there is nothing I can do on this machine to fix that without going on site. [1] 🤦‍♂️
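If the difference between the two commands is hard to picture, here is a toy model of it in Python. It is a simplification that only tracks supplementary groups; `usermod_groups` is a made-up function, and the starting groups are invented for the example.

```python
def usermod_groups(current, requested, append=False):
    """Toy model of `usermod -G` (append=False) versus `usermod -aG` (append=True)."""
    requested = set(requested)
    # Without -a the requested list replaces the old one; with -a it is merged in.
    return set(current) | requested if append else requested

mine = {"sudo", "adm"}  # made-up supplementary groups

print(sorted(usermod_groups(mine, ["webadmin"])))               # ['webadmin']
print(sorted(usermod_groups(mine, ["webadmin"], append=True)))  # ['adm', 'sudo', 'webadmin']
```

The first call is what I typed; the second is what I meant.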

Sometimes it’s the easy things that get you.

-a


  1. I should mention that there is no firmware-level remote management on this particular machine, so I couldn’t just use that.