It was usefull during development.
From andrewrk code review comment:
In fact, Zig does not guarantee the @sizeOf structs, and so these tests are not valid.
Encountered in a recent CI run on an aarch64-windows dev kit.
Pretty sure I disabled the virus scanner but it looks like it turned
itself back on with a Windows Update.
Rather than marking the new error code as unreachable in the places
where it is unexpected, this commit makes it return `error.Unexpected`.
Zig deflate compression/decompression implementation. It supports compression and decompression of gzip, zlib and raw deflate format.
Fixes#18062.
This PR replaces current compress/gzip and compress/zlib packages. Deflate package is renamed to flate. Flate is common name for deflate/inflate where deflate is compression and inflate decompression.
There are breaking change. Methods signatures are changed because of removal of the allocator, and I also unified API for all three namespaces (flate, gzip, zlib).
Currently I put old packages under v1 namespace they are still available as compress/v1/gzip, compress/v1/zlib, compress/v1/deflate. Idea is to give users of the current API little time to postpone analyzing what they had to change. Although that rises question when it is safe to remove that v1 namespace.
Here is current API in the compress package:
```Zig
// deflate
fn compressor(allocator, writer, options) !Compressor(@TypeOf(writer))
fn Compressor(comptime WriterType) type
fn decompressor(allocator, reader, null) !Decompressor(@TypeOf(reader))
fn Decompressor(comptime ReaderType: type) type
// gzip
fn compress(allocator, writer, options) !Compress(@TypeOf(writer))
fn Compress(comptime WriterType: type) type
fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// zlib
fn compressStream(allocator, writer, options) !CompressStream(@TypeOf(writer))
fn CompressStream(comptime WriterType: type) type
fn decompressStream(allocator, reader) !DecompressStream(@TypeOf(reader))
fn DecompressStream(comptime ReaderType: type) type
// xz
fn decompress(allocator: Allocator, reader: anytype) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// lzma
fn decompress(allocator, reader) !Decompress(@TypeOf(reader))
fn Decompress(comptime ReaderType: type) type
// lzma2
fn decompress(allocator, reader, writer !void
// zstandard:
fn DecompressStream(ReaderType, options) type
fn decompressStream(allocator, reader) DecompressStream(@TypeOf(reader), .{})
struct decompress
```
The proposed naming convention:
- Compressor/Decompressor for functions which return type, like Reader/Writer/GeneralPurposeAllocator
- compressor/compressor for functions which are initializers for that type, like reader/writer/allocator
- compress/decompress for one shot operations, accepts reader/writer pair, like read/write/alloc
```Zig
/// Compress from reader and write compressed data to the writer.
fn compress(reader: anytype, writer: anytype, options: Options) !void
/// Create Compressor which outputs the writer.
fn compressor(writer: anytype, options: Options) !Compressor(@TypeOf(writer))
/// Compressor type
fn Compressor(comptime WriterType: type) type
/// Decompress from reader and write plain data to the writer.
fn decompress(reader: anytype, writer: anytype) !void
/// Create Decompressor which reads from reader.
fn decompressor(reader: anytype) Decompressor(@TypeOf(reader)
/// Decompressor type
fn Decompressor(comptime ReaderType: type) type
```
Comparing this implementation with the one we currently have in Zig's standard library (std).
Std is roughly 1.2-1.4 times slower in decompression, and 1.1-1.2 times slower in compression. Compressed sizes are pretty much same in both cases.
More resutls in [this](https://github.com/ianic/flate) repo.
This library uses static allocations for all structures, doesn't require allocator. That makes sense especially for deflate where all structures, internal buffers are allocated to the full size. Little less for inflate where we std version uses less memory by not preallocating to theoretical max size array which are usually not fully used.
For deflate this library allocates 395K while std 779K.
For inflate this library allocates 74.5K while std around 36K.
Inflate difference is because we here use 64K history instead of 32K in std.
If merged existing usage of compress gzip/zlib/deflate need some changes. Here is example with necessary changes in comments:
```Zig
const std = @import("std");
// To get this file:
// wget -nc -O war_and_peace.txt https://www.gutenberg.org/ebooks/2600.txt.utf-8
const data = @embedFile("war_and_peace.txt");
pub fn main() !void {
var gpa = std.heap.GeneralPurposeAllocator(.{}){};
defer std.debug.assert(gpa.deinit() == .ok);
const allocator = gpa.allocator();
try oldDeflate(allocator);
try new(std.compress.flate, allocator);
try oldZlib(allocator);
try new(std.compress.zlib, allocator);
try oldGzip(allocator);
try new(std.compress.gzip, allocator);
}
pub fn new(comptime pkg: type, allocator: std.mem.Allocator) !void {
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Compressor
var cmp = try pkg.compressor(buf.writer(), .{});
_ = try cmp.write(data);
try cmp.finish();
var fbs = std.io.fixedBufferStream(buf.items);
// Decompressor
var dcp = pkg.decompressor(fbs.reader());
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldDeflate(allocator: std.mem.Allocator) !void {
const deflate = std.compress.v1.deflate;
// Compressor
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Remove allocator
// Rename deflate -> flate
var cmp = try deflate.compressor(allocator, buf.writer(), .{});
_ = try cmp.write(data);
try cmp.close(); // Rename to finish
cmp.deinit(); // Remove
// Decompressor
var fbs = std.io.fixedBufferStream(buf.items);
// Remove allocator and last param
// Rename deflate -> flate
// Remove try
var dcp = try deflate.decompressor(allocator, fbs.reader(), null);
defer dcp.deinit(); // Remove
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldZlib(allocator: std.mem.Allocator) !void {
const zlib = std.compress.v1.zlib;
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Compressor
// Rename compressStream => compressor
// Remove allocator
var cmp = try zlib.compressStream(allocator, buf.writer(), .{});
_ = try cmp.write(data);
try cmp.finish();
cmp.deinit(); // Remove
var fbs = std.io.fixedBufferStream(buf.items);
// Decompressor
// decompressStream => decompressor
// Remove allocator
// Remove try
var dcp = try zlib.decompressStream(allocator, fbs.reader());
defer dcp.deinit(); // Remove
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
pub fn oldGzip(allocator: std.mem.Allocator) !void {
const gzip = std.compress.v1.gzip;
var buf = std.ArrayList(u8).init(allocator);
defer buf.deinit();
// Compressor
// Rename compress => compressor
// Remove allocator
var cmp = try gzip.compress(allocator, buf.writer(), .{});
_ = try cmp.write(data);
try cmp.close(); // Rename to finisho
cmp.deinit(); // Remove
var fbs = std.io.fixedBufferStream(buf.items);
// Decompressor
// Rename decompress => decompressor
// Remove allocator
// Remove try
var dcp = try gzip.decompress(allocator, fbs.reader());
defer dcp.deinit(); // Remove
const plain = try dcp.reader().readAllAlloc(allocator, std.math.maxInt(usize));
defer allocator.free(plain);
try std.testing.expectEqualSlices(u8, data, plain);
}
```
* Add `timedWait` to `std.Thread.Semaphore`
Add example to documentation of `std.Thread.Semaphore`
* Add unit test for thread semaphore timed wait
Fix missing try
* Change unit test to be simpler
* Change `timedWait()` to keep a deadline
* Change `timedWait()` to return earlier in some scenarios
* Change `timedWait()` to keep a deadline (based on std.Timer)
(similar to std.Thread.Futex)
---------
Co-authored-by: protty <45520026+kprotty@users.noreply.github.com>
* std.c: consolidate some definitions, making them share code. For
example, freebsd, dragonfly, and openbsd can all share the same
`pthread_mutex_t` definition.
* add type safety to std.c.O
- this caught a bug where mode flags were incorrectly passed as the
open flags.
* 3 fewer uses of usingnamespace keyword
* as per convention, remove purposeless field prefixes from struct field
names even if they have those prefixes in the corresponding C code.
* fix incorrect wasi libc Stat definition
* remove C definitions from incorrectly being in std.os.wasi
* make std.os.wasi definitions type safe
* go through wasi native APIs even when linking libc because the libc
APIs are problematic and wasteful
* don't expose WASI definitions in std.posix
* remove std.os.wasi.rights_t.ALL: this is a footgun. should it be all
future rights too? or only all current rights known? both are
the wrong answer.
The default logging function used to have no buffer. So a single log
statement could result in many individual write syscalls each writing
only a couple of bytes.
After this change the logging function now has a 4kb buffer. Only log
statements longer than 4kb now do multiple write syscalls.
4kb is the default bufferedWriter size and was choosen arbitrarily.
The downside of this is that the log function now allocates 4kb more
stack space but I think that is an acceptable trade-off.
The old definitions had some problems:
- In `lowerBound`, the `lhs` (left hand side) argument was passed on the
right hand side.
- In `upperBound`, the `greaterThan` function needed to return
`greaterThanOrEqual` for the function work, so either the name or the
implementation is incorrect.
To fix both problems, define the functions in terms of a `lessThan` function that returns `lhs < rhs`.
The is more consistent with the rest of `sort.zig` and it's also how C++ implements lower/upperBound (1)(2).
(1) https://en.cppreference.com/w/cpp/algorithm/lower_bound
(2) https://en.cppreference.com/w/cpp/algorithm/upper_bound
- Rewrite doc comments.
- Add a couple of more test cases.
- Add docstring for std.sort.binarySearch
Authored by https://github.com/CraigglesO
Original Discussion: #9890
I already had to create these functions and test cases for my own
project so I decided to contribute to the main code base in hopes it
would simplify my own.
I realize this is still under discussion but this was a trivial amount
of work so I thought I could help nudge the discussion towards a
decision.
Why add these to the standard library
To better illustrate and solidify their value, the standard library's
"sort" module already contains several binary search queries on arrays
such as binarySearch and internal functions binaryFirst & binaryLast. A
final example of its use: the Zig code itself created and used a
bounding search in the linker.
There still lacks the ability to allow the programmer themselves to
search the array for a rough position and find an index to read &/
update.
Adding these functions would also help to complement dynamic structures
like ArrayList with it's insert function.
Example Case
I'm building a library in Zig for GIS geometry. To store points, lines,
and polygons each 3D point is first translated into what's called an
S2CellId. This is a fancy way of saying I reduce the Earth's spherical
data into a 1D Hilbert Curve with cm precision. This gives me 2
convenient truths:
Hilbert Curves have locality of reference.
All points can be stored inside a 1D array
Since lowerBound and upperBound to find data inside a radius for
instance. If I'm interested in a specific cell at a specific "level" and
want to iterate all duplicates, equalRange is a best fit.
This change causes `__isPlatformVersionAtLeast` to no longer exist in
compiler_rt when targetting a min os version earlier than 10.15, which
is earlier than the default os version and so only affects builds that
explicitly target an older version than Zig officially supports.
Pass the required lazy dependencies from the build runner to the parent
process via a tmp file instead of stdout. I'll reproduce this comment to
explain it:
The `zig build` parent process needs a way to obtain results from the
configuration phase of the child process. In the future, the make phase
will be executed in a separate process than the configure phase, and we
can then use stdout from the configuration phase for this purpose.
However, currently, both phases are in the same process, and Run Step
provides API for making the runned subprocesses inherit stdout and stderr
which means these streams are not available for passing metadata back
to the parent.
Until make and configure phases are separated into different processes,
the strategy is to choose a temporary file name ahead of time, and then
read this file in the parent to obtain the results, in the case the child
exits with code 3.
This commit also extracts some common logic from the loop that rebuilds
the build runner so that it does not run again when the build runner is
rebuilt.
Build manifest files support `lazy: true` for dependency sections.
This causes the auto-generated dependencies.zig to have 2 more
possibilities:
1. It communicates whether a dependency is lazy or not.
2. The dependency might be acknowledged, but missing due to being lazy
and not fetched.
Lazy dependencies are not fetched by default, but if they are already
fetched then they are provided to the build script.
The build runner reports the set of missing lazy dependenices that are
required to the parent process via stdout and indicates the situation
with exit code 3.
std.Build now has a `lazyDependency` function. I'll let the doc comments
speak for themselves:
When this function is called, it means that the current build does, in
fact, require this dependency. If the dependency is already fetched, it
proceeds in the same manner as `dependency`. However if the dependency
was not fetched, then when the build script is finished running, the
build will not proceed to the make phase. Instead, the parent process
will additionally fetch all the lazy dependencies that were actually
required by running the build script, rebuild the build script, and then
run it again.
In other words, if this function returns `null` it means that the only
purpose of completing the configure phase is to find out all the other
lazy dependencies that are also required.
It is allowed to use this function for non-lazy dependencies, in which
case it will never return `null`. This allows toggling laziness via
build.zig.zon without changing build.zig logic.
The CLI for `zig build` detects this situation, but the logic for then
redoing the build process with these extra dependencies fetched is not
yet implemented.
This allows a `zig build` command to specify intention to create a
release build, regardless of what per-project options exist. It also
allows the command to specify a "preferred optimization mode", which is
chosen if the project itself does not choose one (in other words, the
project gets first choice). If neither the build command nor the project
specify a preferred release mode, an error occurs.
Before it was named "library" inconsistently.
Now the CLI args are -fsys=[name] and -fno-sys=[name] and it is a more
general-purpose "system integration" which could be a library name or
perhaps a project name such as "ffmpeg" or a binary such as "nasm".
This also makes a long-overdue change of extracting common state from
Build into a shared Graph object.
Getting the semantics right for these flags turned out to be quite
tricky. In the end it works like this:
* The override only happens when the target is fully native, with no
additional query parameters, such as versions or CPU features added.
* The override affects the resolved Target but leaves the original Query
unmodified.
* The "is native?" detection logic operates on the original, unmodified
query. This makes it possible to provide invalid host target
information, causing confusing errors to occur. Don't do that.
There are some minor breaking changes to std.Build API such as the fact
that `b.zig_exe` is now moved to `b.graph.zig_exe`, as well as a handful
of other similar flags.
This eliminates some simple usages of `usingnamespace` in the standard
library. This construct may in future be removed from the language, and
is generally an inappropriate way to formulate code. It is also
problematic for incremental compilation, which may not initially support
projects using it.
I wasn't entirely sure what the appropriate namespacing for the types in
`std.os.uefi.tables` would be, so I ofted to preserve the current
namespacing, meaning this is not a breaking change. It's possible some
of the moved types should instead be namespaced under `BootServices`
etc, but this can be a future enhancement.
Since `bufPrint` and `count` both control the writers used internally,
they can leverage type-erased writers while maintaining correct error
handling. This reduces generic instantiations when using `allocPrint`,
which calls both `count` and `bufPrint` internally.
Closes#18628
This commit splits the arguments obtained from pkg-config into two
groups, cflags and libs, and consistently applies the cflags to each
individual module linking the library while applying the libs only once
for each compilation.
`NIX_LDFLAGS` typically contains just `-rpath` and `-L`, which we already
handle. However, at least one setup hook in Nixpkgs [0] adds a linkage
directive to it. To prevent library paths from being missed (as I've
observed myself with `NIX_LDFLAGS` being `-liconv ...`, making it so that
*all* paths are missed), let's just skip over them.
[0]: 08f615eb1b/pkgs/development/libraries/libiconv/setup-hook.sh
When formatting a pointer to user type, currently it needs to be
dereferenced first, then call `formatType` on the child type.
Fix the problem by checking for "format" function on not only the type
itself, but also the struct it points to. Add hasMethod to std.meta.
Free the allocated threads in the initialization of a thread pool only with pool.join instead of additionally calling allocator.free causing free to be called twice.
Resolves#18643
Currently, std.fmt has a misguided, half-assed Unicode implementation
with an ambiguous definition of the word "character". This commit does
almost nothing to mitigate the problem, but it lets me close an open PR.
In the future I will revert 473cb1fd74 as
well as 279607cae5, and redo the whole
std.fmt API, breaking everyone's code and unfortunately causing nearly
every Zig user to have a bad day. std.fmt will go back to only dealing
in bytes, with zero Unicode awareness whatsoever. I suggest a third
party package provide Unicode functionality as well as a more advanced
text formatting function for when Unicode awareness is needed. I have
always suggested this, and I sincerely apologize for merging pull
requests that compromised my stance on this matter.
Most applications should, instead, strive to make their code independent
of Unicode, dealing strictly in encoded UTF-8 bytes, and never attempt
operations such as: substring manipulation, capitalization, alignment,
word replacement, or column number calculations.
Exceptions to this include web browsers, GUI toolkits, and terminals. If
you're not making one of these, any dependency on Unicode is probably a
bug or worse, a poor design decision.
closes#18536
The following test fails since NonCanonical is not handled
test "foo" {
std.net.Ip4Address.resolveIp("1.1.1.1", 0) catch unreachable;
}
/usr/lib/zig/std/net.zig:240:60: error: switch must handle all possibilities
if (parse(name, port)) |ip4| return ip4 else |err| switch (err) {
^~~~~~
/usr/lib/zig/std/net.zig:240:60: note: unhandled error value: 'error.NonCanonical'
referenced by:
test.foo: src/dhcp.zig:383:23
Uses the new `-M[name][=src]` CLI syntax to omit the source when the
module does not have a zig root source file.
Only some kinds of link objects imply that this should happen.
The code asserted that the range to be replaced is within bounds of
`self.items`.
This is now reflected in the doc comment.
The old, wrong doc comment was copied from the `insert*` fns.
With this assertion holding true, `start + len` is always within the
address space and `start + new_items.len` is, at this point, always
strictly within bounds of `self.items`.
These changes enable me to use `GeneralPurposeAllocator` with my "Bring
Your Own OS" package. The previous checks for a freestanding target have
been expanded to `@hasDecl` checks.
- `root.os.heap.page_allocator` is used if it exists.
- `debug.isValidMemory` only calls `os.msync` if it's supported.
Per last paragraph of RFC 8446, Section 5.2, the length of the inner content of an encrypted record must not exceed 2^14 + 1, while that of the whole encrypted record must not exceed 2^14 + 256.
Because creation of a symlink can fail on Windows with an Access Denied
error (https://learn.microsoft.com/en-us/windows/security/threat-protection/security-policy-settings/create-symbolic-links)
any tests that need a symbolic link "skip" if they run into this problem.
This change factors out a "setupSymbolicLink()" routine to make this
clearer, a bit tighter, and easier to use in future tests.
I also collapsed the "symlink in parent directory" test into the existing
"Dir.readlink" test, because the latter uses the more comprehensive
testWithAllSupportedPathTypes wrapper.
The test runner uses "." in its output between the test module and the
test name, so quote the leading '.' in these test names to make them
easier to read.
Client for tls was using a function that wasn't declared on the
interface for it. The issue wasn't apparent because net stream
implemented that function.
I changed it to keep the interface promise of what's required to be
compatible with the tls client functionality.
As suggested by @matu3ba, it can be better to use Security Attributes
directly while creating the handle instead of creating the handle then
setting the handle to inherit. Doing so can prevent potentially leaking
to other parallel spawned processes which would inherit the opened `\Device\Null`
handle.
This change also allows windows.OpenFile to handle when bInheritHandle
is set.
Note that we are using the same `saAttr`, but since it's taken as a
pointer to a const in all calls, it's never mutated, and OpenFile never alters it.
This also saves 1 kernel call for setting the handle to inherit.
This commit allows write access to the `\\Device\\Null` Handle.
Without a write access, it's not possible for the child process to write
SdOut to Null. As a requirement `SetHandleInformation` was also changed
to mark the handle as iheritable (by adding it to Flags) by the spawned process.
This allows the child to access the NUL device that was opened.
This also makes the Windows part to behave similarly to `spawnPosix`.
- Add syscall bindings/structures for the `futex2` family.
The documentation is taken from the syscall definitions.
- Add documnentation for the `cachestat` bindings and structures.
Taken from work I did in Cosmopolitian libc.
- Add binding for `map_shadow_stack`.
No documentation for this one, since the kernel devs didn't bother to
do it ¯\_(ツ)_/¯.
This syscall was added to simplify the the libc implementations of
fchmodat, as the original syscall does not take a `flags` argument.
Another syscall, `map_shadow_stack`, was also added for x86_64.
This reverts commit d9d840a33a, reversing
changes made to a04d433094.
This is not an adequate implementation of the missing safety check, as
evidenced by the changes to std.json that are reverted in this commit.
Reopens#18382Closes#18510
Itarator has `next` function, iterates over tar files. When using from
outside of module with `tar.` prefix makes more sense.
var iter = tar.iterator(reader, null);
while (try iter.next()) |file| {
...
}
Split reading/parsing tar file and writing results to the disk in two
separate steps. So we can later test parsing part without need to write
everyting to the disk.
* Specifically recognize stderr as a different concept than an error
message in Step results.
* Display it differently when only stderr occurs but the build proceeds
successfully.
closes#18473