mjr00 4 hours ago

Doing anything like this in __init__ is crazy. Even `Path("config.json").read_text()` in a constructor isn't a good idea.

Friends don't let friends build complicated constructors that can fail; this is a huge violation of the Principle of Least Astonishment. If you require external resources like a zeromq socket, use connect/open/close/etc methods (and a contextmanager, probably). If you need configuration, create a separate function that parses the configuration, then returns the object.

I appreciate the author's circumstances may have not allowed them to make these changes, but it'd drive me nuts leaving this as-is.

  • rcxdude 2 hours ago

    Why? An object encapsulates some state. If it doesn't do anything unless ypu call some other method on it first, it should just happen in the constructor. Otherwise you've got one object that's actually two types: the object half-initialised and the object fully initialised, and it's very easy to confuse the two. Especially in python there's basically no situation where you're going to need that half-state for some language restriction.

    • hdjrudni an hour ago

      > Otherwise you've got one object that's actually two types: the object half-initialised and the object fully initialised, and it's very easy to confuse the two.

      You said it yourself -- if you feel like you have two objects, then literally use two objects if you need to split it up that way, FooBarBuilder and FooBar. That way FooBar is always fully built, and if FooBarBuilder needs to do funky black magic, it can do so in `build()` instead of `__init__`.

      • exe34 21 minutes ago

        no thanks, that's how we end up with FactoryFactory. If the work needs to be done upon startup, then it needs to be done. if it is done in response to a later event, then it been done later.

    • mjr00 2 hours ago

      It's a lot easier to reason about code if I don't need to worry that something as simple as

          my_variable = MyType()
      
      might be calling out to a database with invalid credentials, establishing a network connection which may fail to connect, or reading a file from the filesystem which might not exist.

      You are correct that you don't want an object that can be half-initialized. In that case, any external resources necessary should be allocated before the constructor is called. This is particularly true if the external resources must be closed after use. You can see my other comment[0] in this thread for a full example, but the short answer is use context managers; this is the Pythonic way to do RAII.

      [0] https://news.ycombinator.com/item?id=43736940

    • banthar 2 hours ago

      Python context managers somewhat require it. You have to create an object on which you can call `__enter__`.

      • jakewins 3 minutes ago

        This is the exact opposite? They explicitly encapsulate doing resource-opening in the __enter__ method call, and then returning the opened resource encapsulated inside an object.

        Nothing about the contract encourages doing anything fallible in __init__

      • ptsneves an hour ago

        It is a tragedy that python almost got some form or RAII but then figured out an object has 2 stages of usage.

        I also strongly disagree constructors cannot fail. An object that is not usable should fail fast and stop code flow the earliest possible. Fail early is a good thing.

  • jhji7i77l an hour ago

    > Even `Path("config.json").read_text()` in a constructor isn't a good idea.

    If that call is necessary to ensure that the instance has a good/known internal state, I absolutely think it belongs in __init__ (whether called directly or indirectly via a method).

    • mjr00 an hour ago

      You're right that consistent internal state is important, but you can accomplish this with

          class MyClass:
              def __init__(self, config: str):
                  self._config = config
      
      And if your reaction is "that just means something else needs to call Path("config.json").read_text()", you're absolutely right! It's separation of concerns; let some other method deal with the possibility that `config.json` isn't accessible. In the real world, you'd presumably want even more specific checks that specific config values were present in the JSON file, and your constructor would look more like

          def __init__(self, host: str, port: int):
      
      and you'd validate that those values are present in the same place that loads the JSON.

      This simple change makes code so much more readable and testable.

      • asplake 31 minutes ago

        Better still, pass the parsed config. Let the app decide how it is configured, and let it deal with any problems.

  • echelon 3 hours ago

    Not just Python. Most languages with constructors behave badly if setup fails: C++ (especially), Java, JavaScript. Complicated constructors are a nuisance and a danger.

    Rust is the language I'm familiar with that does this exceptionally well (although I'm sure there are others). It's strictly because there are no constructors. Constructors are not special language constructs, and any method can function in that way. So you pay attention to the function signature just like everywhere else: return Result<Self, T> explicitly, heed async/await, etc. A constructor is no different than a static helper method in typical other languages.

    new Thing() with fragility is vastly inferior to new_thing() or Thing::static_method_constructor() without the submarine defects.

    Enforced tree-style inheritance is also weird after experiencing a traits-based OO system where types don't have to fit onto a tree. You're free to pick behaviors at will. Multi-inheritance was a hack that wanted desperately to deliver what traits do, but it just made things more complicated and tightly coupled. I think that's what people hate most about "OOP", not the fact that data and methods are coupled. It's the weird shoehorning onto this inexplicable hierarchy requirement.

    I hope more languages in the future abandon constructors and strict tree-style and/or multi-inheritance. It's something existing languages could bolt on as well. Loose coupling with the same behavior as ordinary functions is so much easier to reason about. These things are so dated now and are best left in the 60's and 70's from whence they came.

    • rcxdude 2 hours ago

      The annoying thing is you can actually just use the rust solution in C++ as well (at least for initial construction), but basically no-one does.

    • hdjrudni an hour ago

      I don't have enough experience with traits, but they also sound like a recipe for creating a mess. I find anything more than like 1 level of inheritance starts to create trouble. But perhaps that's the magic of traits? Instead of creating deep stacks, you mix-and-match all your traits on your leaf class?

      • echelon 42 minutes ago

        > I find anything more than like 1 level of inheritance starts to create trouble.

        That's the beauty of traits (or "type classes"). They're behaviors and they don't require thinking in terms of inheritance. Think interfaces instead.

        If you want your structure or object to print debug information when logged to the console, you custom implement or auto-derive a "Debug" trait.

        If you want your structure or object to implement copies, you write your own or auto-derive a "Clone" trait. You can control whether they're shallow or deep if you want.

        If you want your structure or object to be convertible to JSON or TOML or some other format, you implement or auto-derive "Serialize". Or to get the opposite behavior of hydrating from strings, the "Deserialize" trait.

        If you're building a custom GUI application, and you want your widget to be a button, you implement or auto-derive something perhaps called "Button". You don't have to shoehorn this into some kind of "GObject > Widget > Button" kind of craziness.

        You can take just what you need and keep everything flat.

        Here's a graphical argument I've laid out: https://imgur.com/a/bmdUV3H

    • whatevaa an hour ago

      In C#, it is also not a good idea to have constructors with failable io in them.

    • fouronnes3 3 hours ago

      How would you do traits in Python?

      • vaylian 42 minutes ago

        Python has a trait-like system. It's called "abstract base classes": https://docs.python.org/3/library/abc.html

        The abstract base class should use the @abstractmethod decorator for methods that the implementing class needs to implement itself.

        Obviously, you can also use abstract base classes in other ways, but they can be used as a way to define traits/interfaces for classes.

      • echelon 3 hours ago

        I'd really need to think long and hard about it, but my initial feeling is that we'd attach them to data classes or a similar new construct. I don't think you'd want to reason about the blast radius with ordinary classes. Granted, that's more language complexity, creates two unequal systems, and makes much more to reason about. There's a lot to juggle.

        As much fun as putting a PEP together might be, I don't think I have the bandwidth to do so. I would really like to see traits in Python, though.

zokier 4 hours ago

Pretty simple to fix by changing the _init to something like:

    def _init(self):
        init_complete = threading.Event()
        def worker_thread_start():
            FooWidget.__init__(self)
            init_complete.set()
            self.run()

        worker_thread = Thread(target=worker_thread_start, daemon=True)
        worker_thread.start()
        init_complete.wait()

Spawning worker thread from constructor is not that crazy, but you want to make sure the constructing is completed by the time you return from the constructor.
  • greatgib 4 hours ago

    To be clear, a good init is not supposed to create a thread or do any execution that might not be instant of things like that.

    It would have been better to an addition start, run, exec method that does that kind of things. Even if for usage it is an inch more complicated to use with an additional call.

    • lmm 4 hours ago

      > It would have been better to an addition start, run, exec method that does that kind of things. Even if for usage it is an inch more complicated to use with an additional call.

      That breaks RAII - you shouldn't give the user a way to create the object in an invalid state. Although if it's intended to be used as a context manager anyway then maybe doing it on context enter rather than on init would be nicer.

      • greatgib 2 hours ago

        Doesn't necessarily have to be in an invalid state. You can have your object with a none value for the socket or a state like "not_started" and the different method will look at that to handle everything properly.

        Also, it was not supposed to be used in a context manager as there is just a close method but it is the caller that decided to wrap it in a context.

        But as what you suggest, indeed it would have been a great idea to add an __enter__ and do the starting there are as it would be the proper way.

    • crdrost 3 hours ago

      I don't think it's totally crazy to, say, open a DB connection in __init__() even though that's not an instantaneous operation. That's not a hill I would die on, you can just say “open a connection and hand it to me as an argument,” but it just looks a little cleaner to me if the lifecycle of the connection is being handled inside this DB-operations class. (You can also defer creating the connection until it is actually used, or require an explicit call to connect it, but then you are also writing a bunch of boilerplate to make the type checker happy for cases where the class is having its methods called before it was properly connected.)

      • mjr00 3 hours ago

        It's not totally crazy in that I see it all the time, but it's one of the two most common things I've found make Python code difficult to reason about.[0] After all, if you open a DB connection in __init__() -- how do you close it? This isn't C++ where we can tie that to a destructor. I've run into so many Python codebases that do this and have tons of unclosed connections as a result.

        A much cleaner way (IMO) to do this is use context managers that have explicit lifecycles, so something like this:

            @contextmanager
            def create_db_client(host: str, port: int) -> Generator[_DbClient, None, None]:
                try:
                    connection = mydblib.connect(host, port)
                    client = _DbClient(connection)
                    yield client
                finally:
                    connection.close()
        
        
            class _DbClient:
                def __init__(self, connection):
                   self._connection = connection
                
                def do_thing_that_requires_connection(...):
                   ...
        
        Which lets you write client code that looks like

            with create_db_client('localhost', 5432) as db_client:  # port 3306 if you're a degenerate
                db_client.do_thing_that_requires_connection(...)
        
        This gives you type safety, connection safety, has minimal boilerplate for client code, and ensures the connection is created and disposed of properly. Obviously in larger codebases there's some more nuances, and you might want to implement a `typing.Protocol` for `_DbClient` that lets you pass it around, but IMO the general idea is much better than initializing a connection to a DB, ZeroMQ socket, gRPC client, etc in __init__.

        [0] The second is performing "heavy", potentially failing operations outside of functions and classes, which can cause failures when importing modules.

        • lijok 19 minutes ago

          > This isn't C++ where we can tie that to a destructor

          `def __del__`

          • mjr00 15 minutes ago

            C++ destructors are deterministic. Relying on a nondeterministic GC call to run __del__ is not good code.

            Also worth noting that the Python spec does not say __del__ must be called, only that it may be called after all references are deleted. So, no, you can't tie it to __del__.

    • jhji7i77l an hour ago

      > To be clear, a good init is not supposed to create a thread or do any execution that might not be instant of things like that.

      Maybe it indirectly ensures that only one thread is created per instantiation; there are better ways of achieving that though.

    • eptcyka 2 hours ago

      What if the object is in an invalid state unless the thread is started?

      • greatgib 2 hours ago

        Nothing prevents you to have a valid "not_yet_started" state.

        • rcxdude 2 hours ago

          No, but it does complicate things substantially. I don't understand why you would have a useless half-constructed state, especially in python where init isn't all that magical.

        • eptcyka 30 minutes ago

          And what would you do with this object after construction? Call start() on it immediately?

  • ohr 4 hours ago

    Thanks, I hate it! Jk, but would you consider this a good solution overall?

    • shiandow 2 hours ago

      I can't come up with a single good reason why you would wish to use inheritance in this case.

      Except possibly for type checking, but

      1. Python has duck typing anyway

      2. It's debatable whether these two classes should be interchangeable

      3. You shouldn't need to use inheritance just to make the type checking work.

      It could be considered a clever hack in some situations, but it's completely unsurprising that it causes issues. Putting band-aids on it after you find a bug does not fix the real problem.

    • im3w1l 2 hours ago

      Not him, but I would also consider making FooBarWidget have a FooWidget rather than be a FooWidget.

      Normally with a FooWidget you can create one on some thread and then perform operations on it on that thread. But in the case of the FooBarWidget you can not do operations on it because operations must be done in the special thread that is inaccessible.

wodenokoto 4 hours ago

> solving an annoying problem with a complete and utter disregard to sanity, common sense, and the feelings of other human beings.

While I enjoy comments like these (and the article overall!), they stand stronger if followed by an example of a solution that regards sanity, common sense and the feelings of others.

  • crdrost 3 hours ago

    So in this case that would be, a FooBarWidget is not a subclass of FooWidget but maybe still a subclass of AbstractWidget above that. It contains a thread and config as its state variables. That thread instantiates a FooWidget with the saved config, and runs it, and finally closes its queue.

    The problem still occurs, because you have to define what it means to close a FooBarWidget and I don't think python Thread has a “throw an exception into this thread” method. Just setting the should_exit property, has the same problem as the post! The thread might still be initing the object and any attempt to tweak across threads could sometimes tweak before init is complete because init does I/O. But once you are there, this is just a tweak of the FooWidget code. FooWidget could respond to a lock, a semaphore, a queue of requests, any number of things, to be told to shut down.

    In fact, Python has a nice built-in module called asyncio, which implements tasks, and tasks can be canceled and other such things like that, probably you just wanted to move the foowidget code into a task. (See @jackdied's great talk “Stop Writing Classes” for more on this general approach. It's not about asyncio specifically, rather it's just about how the moment we start throwing classes into our code, we start to think about things that are not just solving the problem in front of us, and solving the problem in front of us could be done with just simple structures from the standard library.)

pjot 3 hours ago

Rather than juggling your parent’s __init__ on another thread, it’s usually clearer to:

1. Keep all of your object–initialization in the main thread (i.e. call super().__init__() synchronously).

2. Defer any ZMQ socket creation that you actually use in the background thread into the thread itself.

  • ledauphin 2 hours ago

    yeah, this just feels like one of those things that's begging for a lazy (on first use) initialization. if you can't share or transfer the socket between threads in the first place, then your code will definitionally not be planning to use the object in the main thread.

_Algernon_ 4 hours ago

fix: add time.sleep(0.1) in the .close method.

  • gnfargbl 4 hours ago

        # NOTE: doubled this to 0.2s because of some weird test failures.
  • Const-me 3 hours ago

    Using sleep instead of proper thread synchronization is unreliable. Some people have slow computers like a cheap tablet or free tier cloud VM. Other people have insufficient memory for the software they run and their OS swaps.

    When I need something like that, in my destructor I usually ask worker thread to shut down voluntarily (using a special posted message, manual reset event, or an atomic flag), then wait for the worker thread to quit within a reasonable timeout. If it didn’t quit within the timeout, I log a warning message if I can.

    • crdrost 3 hours ago

      I believe _Algernon_ was making a joke about how you could take this problem and make the code even worse, not proposing a solution to the problem.

      It's also not even a problem with slow computers or insufficient memory, __init__ does I/O here, it connects to ZeroMQ, so it could have arbitrary latency in various circumstances exceeding the 100 milliseconds that we would be sleeping for. So the joke is, this fixes the problem in your test environment where everything is squeaky clean and you know that ZeroMQ is reachable, and now you have bugs in prod still.

    • btown 3 hours ago

      Notably on this point, resource contention on e.g. a Kubernetes cluster can cause these kinds of things to fail spuriously. Sleeps have their uses, but assuming that work is done while you’re sleeping is rarely a good idea, because when it rains it will pour.

smitty1e 5 hours ago

To paraphrase Wheeler/Lampson: "All problems in python can be solved by another level of indiscretion."

  • jerf 4 hours ago

    Oh, that's good. That has some depth to it. I'm going to have to remember that one. Of course one can switch out "Python" for whatever the local situation is too.

devrandoom 3 hours ago

That's Python's constructor, innit?