
• Joker_vD 4 hours ago

Every time I see "use ranges and algorithms!" examples, I am baffled that apparently, I am supposed to find

    inline double algorithm_call(std::span<double const> xs) noexcept {
        return std::accumulate(
            xs.begin(),
            xs.end(),
            0.0,
            [](double acc, double volts) {
                auto mv  = calibrated_mv(volts);
                auto err = residual(mv);
                return weighted_square(err) + acc;
        });
    }

more readable, concise, and easier on my eyes than

    inline double raw_loop(std::span<double const> xs) noexcept {
        double sum = 0.0;

        for (double volts : xs) {
            auto mv  = calibrated_mv(volts);
            auto err = residual(mv);
            sum += weighted_square(err);
        }

        return sum;
    }

Sure, there are some algorithms in <algorithm> that I'd rather not reimplement myself, but this is not one of them.

• Erlangen 2 hours ago

You said "ranges and algorithms", but you didn't copy the third function, which actually uses the <ranges> library.

    inline double ranges_pipeline(std::span<double const> xs) noexcept {
        auto costs = xs
            | std::views::transform(calibrated_mv)
            | std::views::transform(residual)
            | std::views::transform(weighted_square);

        return std::ranges::fold_left(costs, 0.0, std::plus<double>{});
    }

It's still a bit verbose, because C++ doesn't allow universal function call syntax. It will be even more concise in other languages like D.

• Joker_vD an hour ago

That version was so much more opaque that I didn't bother copying that. Again, I'm not entirely sure why people are so enamored with splitting iteration itself from the contents of one iteration step, especially since the loops are language built-ins.

• rzzzt 4 hours ago

The first form is easier to send to 32 beefy cores or 1024 small CPUs or a Beowulf cluster or a GPU or people sitting in a room.

• Joker_vD 41 minutes ago

It's been 15 years since I last touched OpenMP, but the second form is trivially parallelizable as well. Besides, this parallelization can only ever work properly with arrays/vectors or, at the very worst, std::deque as it's usually implemented (a vector of fixed-length arrays), not with e.g. linked lists or red-black trees, so why even bother with generic spans and algorithms?
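A minimal sketch of what "trivially parallelizable" could look like for the loop version, using an OpenMP reduction clause. The three helper bodies are hypothetical stand-ins (the thread never shows them), the name raw_loop_omp is made up, and the function takes a std::vector rather than the thread's std::span only so that it builds as C++17.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the thread's helpers; the real ones are not shown.
inline double calibrated_mv(double volts) noexcept { return volts * 1000.0; }
inline double residual(double mv) noexcept { return mv - 500.0; }
inline double weighted_square(double err) noexcept { return 0.5 * err * err; }

inline double raw_loop_omp(const std::vector<double>& xs) noexcept {
    double sum = 0.0;
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(xs.size());

    // Each thread keeps a private partial sum; OpenMP combines them at the end.
    // Built without OpenMP support, the pragma is ignored and the loop runs
    // sequentially.
    #pragma omp parallel for reduction(+:sum)
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        auto mv  = calibrated_mv(xs[i]);
        auto err = residual(mv);
        sum += weighted_square(err);
    }

    return sum;
}
```

The index-based loop relies on random access, which matches the point above about contiguous containers. Because the partial sums are combined in an unspecified order, the result can differ in the last bits from the sequential loop.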

• xyzzyz 4 hours ago

Both of them have to be completely rewritten to make use of multiprocessing, so what exactly is the advantage?

• mpyne 3 hours ago

The original example isn't really using ranges except to emulate C++98 iterator work though.

The actual equivalent might be something closer to:

    inline double algorithm_call(std::span<double const> xs) noexcept {
        // std::accumulate has no range overload; C++23's fold_left takes the range directly.
        return std::ranges::fold_left(
            xs, 0.0,
            [](double acc, double volts) {
                auto mv  = calibrated_mv(volts);
                auto err = residual(mv);
                return weighted_square(err) + acc;
        });
    }

(that is, without the boilerplate .begin and .end).

Even that is enough to make ranges useful in my mind, but in a codebase which has started to integrate some functional programming techniques, there are also applications for things like views and transforms.

This can make it easier to reason about iteration pipelines in ways you might already be familiar with from POSIX.

That all said, it's C++ so sometimes the error messages get a lot more 'interesting' than they would have with STL-style iterators, especially when mixed with constexpr expressions as you might do with std::format or fmt libs.

• rzzzt 4 hours ago

The first one too? Isn't that the map-reduce fork-join golden example of multiprocessing?

• cwzwarich 3 hours ago

`std::accumulate` is defined to have sequential semantics, so the analysis required to make it parallel is probably not that different than starting from the loop version. I guess you could have an alternate `accumulate_associative` that uses the same interface but assumes the reduction is associative and has unspecified evaluation order?

• mpyne 3 hours ago

C++ has std::reduce for that, which is std::accumulate except it's defined to operate without any specific ordering.

• Joker_vD 38 minutes ago

And now you should probably also stop and consider whether adding elements one-by-one as opposed to recursively adding together sums of smaller subarrays has better or worse numerical behaviour in regards to e.g. rounding and stability.
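To make the numerical question concrete, here is a small sketch comparing left-to-right summation with pairwise (recursive halving) summation in single precision. The function names and the block size of 8 are arbitrary choices for illustration, not anything from the thread.

```cpp
#include <cstddef>
#include <vector>

// Left-to-right summation, the order used by the loop and by std::accumulate.
inline float naive_sum(const float* xs, std::size_t n) noexcept {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += xs[i];
    }
    return sum;
}

// Pairwise summation: split the range in half, sum each half, add the results.
// Small blocks fall back to the plain loop to limit recursion overhead.
inline float pairwise_sum(const float* xs, std::size_t n) noexcept {
    if (n <= 8) {
        return naive_sum(xs, n);
    }
    const std::size_t half = n / 2;
    return pairwise_sum(xs, half) + pairwise_sum(xs + half, n - half);
}
```

Summing about a million copies of 0.1f shows the difference: the running total in naive_sum grows until each added 0.1f is rounded noticeably, while pairwise_sum only ever adds values of similar magnitude. In general the worst-case rounding error of the naive order grows roughly linearly with the element count and that of the pairwise order roughly logarithmically, which is why a reordering reduction such as std::reduce is not automatically worse for accuracy.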

• rzzzt 3 hours ago

Thanks everyone, my C++ knowledge has been greatly expanded today.

• tcfhgj 3 hours ago

1) afaik accumulate cannot be parallelized

2) the map part is included in the accumulate lambda, so the map part cannot be parallelized either -> you'd have to split it out into a transform step (iirc)
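A sketch of the split described above, using std::transform_reduce from <numeric> (C++17), which takes the map step and the reduce step as separate callables. The helper bodies are hypothetical stand-ins, the name map_then_reduce is made up, and a std::vector replaces the thread's std::span only to keep the example buildable as C++17.

```cpp
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for the thread's helpers; the real ones are not shown.
inline double calibrated_mv(double volts) noexcept { return volts * 1000.0; }
inline double residual(double mv) noexcept { return mv - 500.0; }
inline double weighted_square(double err) noexcept { return 0.5 * err * err; }

inline double map_then_reduce(const std::vector<double>& xs) {
    return std::transform_reduce(
        xs.begin(), xs.end(),
        0.0,
        std::plus<>{},          // reduce step: may be regrouped and reordered
        [](double volts) {      // map step: independent for each element
            return weighted_square(residual(calibrated_mv(volts)));
        });
}
```

As written this runs sequentially. The standard also provides overloads that take an execution policy such as std::execution::par from <execution> as the first argument; whether that actually runs in parallel, and whether it needs an extra backend library, depends on the toolchain.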

• CITIZENDOT 3 hours ago

std::accumulate is sequential and guarantees in-order traversal. std::reduce is the version of it that can be parallelized.

• fooker 2 hours ago

Great, now use some functions. From the library or your own, and see this complexity become manageable.

That's what abstraction is about.

• chrka 2 hours ago

Don't trust your compiler. Your code is only fast if you're lucky.

https://tiki.li/blog/lucky_code.html

• charleslmunger 25 minutes ago

I agree you can't trust your compiler, but you can control its behavior more reliably with __builtin_expect_with_probability
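For readers who have not met that builtin, a minimal sketch of how it is typically wrapped. The RARELY macro, the 0.999 probability, and the sum_in_range function are all hypothetical; the builtin itself is a GCC/Clang extension, so the macro falls back to a plain condition elsewhere.

```cpp
#include <cstddef>

// Use the builtin when the compiler advertises it, otherwise a plain condition.
#if defined(__has_builtin)
#  if __has_builtin(__builtin_expect_with_probability)
     // "cond is expected to be false (0) with probability 0.999"
#    define RARELY(cond) (__builtin_expect_with_probability(!!(cond), 0, 0.999))
#  endif
#endif
#ifndef RARELY
#  define RARELY(cond) (!!(cond))
#endif

// Hypothetical hot loop: samples above `limit` are assumed to be rare outliers.
inline double sum_in_range(const double* xs, std::size_t n, double limit) noexcept {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        if (RARELY(xs[i] > limit)) {
            continue;  // cold path: skip the outlier
        }
        sum += xs[i];
    }
    return sum;
}
```

The hint only changes how the compiler lays out and optimizes the branch, never the result, so it is worth checking the generated code or a profile before and after.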

https://github.com/protocolbuffers/protobuf/commit/9f29f02a3...

• kzrdude 5 hours ago

Trust the compiler - sure - but we can't change the whole program by using -ffast-math, unfortunately, so that particular one is out.

• CodesInChaos 3 hours ago

I like the Rust approach of adding operations like `algebraic_add` instead of supporting a compiler flag. This avoids undefined behaviour and keeps the complications from optimizations localized to code using these.

https://doc.rust-lang.org/std/primitive.f32.html#algebraic-o...

> Algebraic operators of the form a.algebraic_*(b) allow the compiler to optimize floating point operations using all the usual algebraic properties of real numbers – despite the fact that those properties do not hold on floating point numbers. This can give a great performance boost since it may unlock vectorization.

> The exact set of optimizations is unspecified but typically allows combining operations, rearranging series of operations based on mathematical properties, converting between division and reciprocal multiplication, and disregarding the sign of zero. This means that the results of elementary operations may have undefined precision, and “non-mathematical” values such as NaN, +/-Inf, or -0.0 may behave in unexpected ways, but these operations will never cause undefined behavior.

> Because of the unpredictable nature of compiler optimizations, the same inputs may produce different results even within a single program run. Unsafe code must not rely on any property of the return value for soundness. However, implementations will generally do their best to pick a reasonable tradeoff between performance and accuracy of the result.

• kstrauser 2 hours ago

I appreciate the semantics and locality of that, too. When you glance at it, you understand that specific tradeoffs are happening right here, and here only, without some CLI arg changing them for the entire program. It’s kinda like unsafe, but for math.

• CoastalCoder 5 hours ago

I really dislike the complexity of modern C++ language specs, but does it obscure much detail about FP ops?

TL;DR:

A vast majority of the programmers I've worked with don't understand the nuances of FP in general, nor the various extents of IEEE-754 support in different programming languages.

So for important numerical programming, I think clarity regarding the FP operations being performed can be crucial. I'm just unclear if modern C++ is a significant factor for that.

• mike_hock 5 hours ago

> Virtual vs static polymorphism

> std::visit over std::variant<A, B, C> is lowered to a switch over the active alternative.

> In this case, layout is probably doing more work than the dispatch mechanism itself.

Very likely because last time I checked visit lowers to a virtual call.
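For context, a small sketch of the pattern being discussed: std::visit over a std::variant<A, B, C> stored contiguously in a vector. The types A, B, C, their members, and the cost and total functions are hypothetical. The sketch makes no claim about how the dispatch is lowered; that depends on the standard library and compiler, so it is best checked in the generated assembly.

```cpp
#include <variant>
#include <vector>

// Hypothetical alternatives; only the names A, B, C come from the quoted text.
struct A { double x; };
struct B { double x; };
struct C { double x; };
using Item = std::variant<A, B, C>;

// Common helper that merges several lambdas into one overload set.
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

inline double cost(const Item& item) {
    return std::visit(overloaded{
        [](const A& a) { return a.x; },
        [](const B& b) { return 2.0 * b.x; },
        [](const C& c) { return 3.0 * c.x; },
    }, item);
}

// The variants live inline in the vector, so iteration walks contiguous
// memory instead of chasing a pointer per element.
inline double total(const std::vector<Item>& items) {
    double sum = 0.0;
    for (const Item& item : items) {
        sum += cost(item);
    }
    return sum;
}
```

The contiguous storage is the layout effect the quoted passage refers to, and it applies regardless of how the visit itself is compiled.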

• mwkaufma 3 hours ago

Unremarked: debug build perf, perf-stability against minor edits, build-time bloat when heavily using std templates.

• Panzerschrek 4 hours ago

> exceptions are slow

There are proposals to introduce better exceptions into C++. Like this: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p07....

But until that is in the standard, people should use std::expected instead.

• Glandalf 5 hours ago

I’ve seen some terrible horrid nonsense from them and even the best compilers don’t use a third of the opcodes our modern CPUs boast of. Nobody understands the big compilers any more either, they’re all too huge. And soon AI will be “improving” them too.

You want to see a beautiful compiler? Look at Plan 9’s compiler suite. A man could understand and even build on that.

• Someone 3 hours ago

> even the best compilers don’t use a third of the opcodes our modern CPUs boast of

That’s not necessarily an indication of the weakness of compilers. It also could be an indication that hardware designers could leave out instructions.

X86, in particular, will have lots of them for backwards compatibility reasons (extreme example: the old 80-bit x87 FP stack)

There also are instructions that are expected to never get used by ‘normal’ compilers but cannot be removed because they only make sense in lower-level code such as those for switching between protection levels, implementing compare-and-swap, etc.

• gmueckl 3 hours ago

x87 support may not be the most obscure part of the instruction set. There is also hardware support for BCD math in 16-bit and 32-bit mode. Who uses that anymore?

• ks6g10 2 hours ago

Unfortunately, some exchanges (TWSE) use packed BCD encoding.

• bluGill 4 hours ago

How does the resulting code compare to what a modern compiler gives me? I don't maintain compilers for a living, I maintain other code, which is ultimately longer and more complex than a C++ compiler. So if my compiler, by becoming a little bit more complex, can make my resulting code a lot simpler because I don't have to do inline optimizations of various sorts, that makes my life much easier and is a good trade-off, since there are a lot more programs in the world than there are compilers.

• sylware 5 hours ago

Are you a fool?

Another name for compilers: invisible backdoor injectors. The more complex the syntax, the more likely it is to happen... I'll let you guess how the "sane" syntax of C++ and similar (LOL) fits in here...

• pjmlp 5 hours ago

Quite a funny comment in the vibe coding age.

• galangalalgol 4 hours ago

Quit poking at the openbsd maintainers. Jokes aside (I mean maybe they are one I don't know), it is at least a coherent opinion that inherently complex but critical software infrastructure would ideally be kept as simple and understandable as possible with all the correctness and verification apparatus staying out of the way so you can see what is there to be backdoored. I use rust primarily and like using it, but there are well over a hundred crates just in the front end, and llvm isn't simple. I do miss the days when I could know what each line did.

• sylware 4 hours ago

And yours is in no way related to mine...

• pjmlp 4 hours ago

Complaining about C++ compilers given the amount of increasing vibe code garbage and related hallucinations, certainly is.

• sylware 4 hours ago

Oh!

You meant it is even worse nowadays with vibe coding! My bad.

• benj111 3 hours ago

What has complex code got to do with it?

Trusting trust was based on old C. You don't get much more minimal than that.