This project is mirrored from https://github.com/cockroachdb/cockroach.
- Mar 06, 2023
-
-
Jayant Shrivastava authored
Previously, when the schema registry encountered an error when registering a schema, it would retry the request. The problem is that upon hitting an error, we clean the body before retrying. Retrying with an empty body results in an obscure error and causes the changefeed to fail. With this change, we now retry with the original request body so the original error is preserved. This change also adds verbose logging for the errors encountered when registering the schema. Release note: None Epic: None
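The retry behavior described above can be illustrated with a minimal Go sketch. It is not the changefeed code: `sendFunc`, `retryWithOriginalBody`, and `flakySender` are hypothetical names, and the sender is a stand-in for the real schema registry request. The point it shows is that each attempt gets a fresh reader over the original bytes, so a retry never sends an empty body.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
)

// sendFunc is a hypothetical stand-in for whatever performs the request.
type sendFunc func(body io.Reader) error

// retryWithOriginalBody retries send up to maxAttempts times. A reader is
// consumed by each attempt, so a new one is built from the original bytes
// every time instead of reusing a drained (empty) body.
func retryWithOriginalBody(body []byte, maxAttempts int, send sendFunc) error {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err := send(bytes.NewReader(body))
		if err == nil {
			return nil
		}
		// Keep the underlying error so the caller sees the real failure.
		lastErr = fmt.Errorf("attempt %d: %w", attempt, err)
	}
	return lastErr
}

// flakySender fails the first `failures` calls and records every body it saw.
func flakySender(failures int) (sendFunc, *[]string) {
	seen := []string{}
	send := func(body io.Reader) error {
		b, err := io.ReadAll(body)
		if err != nil {
			return err
		}
		seen = append(seen, string(b))
		if len(seen) <= failures {
			return errors.New("transient registry error")
		}
		return nil
	}
	return send, &seen
}

func main() {
	send, seen := flakySender(2)
	err := retryWithOriginalBody([]byte(`{"schema":"x"}`), 3, send)
	fmt.Println("error:", err)
	fmt.Println("bodies seen:", *seen)
}
```

Wrapping with `%w` keeps the original error reachable through `errors.Is` and `errors.As`, which matches the goal of surfacing the first failure rather than a secondary one.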
-
- Feb 14, 2023
-
- Feb 13, 2023
-
-
Tommy Reilly authored
Backport of #92775 Fixes: #82464 Release note: SQL queries running on remote nodes now show up in CPU profiles with "distsql.stmt" label.
-
Xin Hao Zhang authored
-
Xin Hao Zhang authored
This commit adds a null check when reading the uiconfig object in stmt details (which is provided in CC but not db-console) to prevent throwing an error, which made the page unusable. Epic: none Release note (bug fix): stmt details page is now able to render Release justification: bug fix
-
Faizan Qazi authored
release-22.1: sql/schemachanger: prevent concurrent declarative/legacy usage for types
-
- Feb 10, 2023
-
-
Nick Travers authored
release-22.1: backport disk-stalled roachtest changes
-
Nick Travers authored
The `fuse` variant of the `disk-stalled` roachtest was skipped in #95865. Re-enable the skipped variant, updating it to make use of our forked version of `charybdefs`. This fork includes a patch that allows for specifying a delay time for syscalls, making it possible to simulate a complete disk stall. Previously, delay times were limited to 50ms, which meant that the detection time had to be even lower (e.g. 40ms), which was not representative of how Cockroach is configured in practice. Allow the roachprod infrastructure to interpolate strings such as `{store-dir}`, etc. when provided as `ExtraArgs` or the `KeyCmd`. Fix by expanding all args, rather than just `ExtraArgs`. Fixes #95874. Release note: None.
-
Nick Travers authored
Currently, if `ExtraArgs` (a `[]string`) is specified for the start options for a cluster, and an argument in the slice contains whitespace, the argument will be split into sub-arguments. This results in situations where an argument intended to be interpreted as a literal string is split into separate arguments. E.g.
```
ExtraArgs = []string{"'foo bar baz'"}
// becomes "'foo" "bar" "baz'"
```
Remove the string splitting logic, instead relying on callers to specify arguments as already "pre-split". Update two existing usages of `ExtraArgs` to pre-split the arguments. Improve documentation. Release note: None.
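The difference between the two behaviors can be sketched in Go as follows. This is an illustration only: `splitArgs` and `passThrough` are hypothetical helpers, and using `strings.Fields` to model the removed splitting logic is an assumption, not the roachprod implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// splitArgs models the old behavior as described: every argument is split on
// whitespace, so a quoted literal is broken into pieces. (Assumed model.)
func splitArgs(extraArgs []string) []string {
	var out []string
	for _, a := range extraArgs {
		out = append(out, strings.Fields(a)...)
	}
	return out
}

// passThrough models the new behavior: arguments are used exactly as the
// caller supplied them, so callers must pre-split.
func passThrough(extraArgs []string) []string {
	return append([]string(nil), extraArgs...)
}

func main() {
	args := []string{"'foo bar baz'"}
	fmt.Printf("%q\n", splitArgs(args))   // ["'foo" "bar" "baz'"]
	fmt.Printf("%q\n", passThrough(args)) // ["'foo bar baz'"]
}
```

Under the pass-through model, a caller that wants three separate arguments supplies them as three slice elements.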
-
Jackson Owens authored
Move the existing disk-stall/* roachtests under disk-stall/fuse/* (for the FUSE filesystem approach to stalling) and skip them for now. Currently, they're not capable of stalling the disk longer than 50us (see #95886), which makes them unreliable at exercising stalls. Add two new roachtests, disk-stall/dmsetup and disk-stall/cgroup, that use dmsetup and cgroup bandwidth restrictions respectively to reliably induce a write stall for an indefinite duration. Informs #94373. Epic: None Release note: None
-
- Feb 09, 2023
-
-
Rafi Shamim authored
-
Rafi Shamim authored
Release note: None
-
sumeerbhola authored
release-22.1: storage: fix bug in legacySSTIterator.SeekGE
-
- Feb 08, 2023
-
-
Rafi Shamim authored
-
Oliver Tan authored
release-22.1: roachtest/tpcc: retry prometheus query during DRT
-
Evan Wall authored
release-22.1: sql: wrap stacktraceless errors with errors.Wrap
-
Evan Wall authored
Fixes #95794 This replaces the previous attempt to add logging in #95797. The context itself cannot be augmented to add a stack trace to errors because it interferes with gRPC timeout logic: gRPC compares errors directly without checking causes (https://github.com/grpc/grpc-go/blob/v1.46.0/rpc_util.go#L833). Although the method signature allows it, `Context.Err()` should not be overridden to customize the error:
```
// If Done is not yet closed, Err returns nil.
// If Done is closed, Err returns a non-nil error explaining why:
// Canceled if the context was canceled
// or DeadlineExceeded if the context's deadline passed.
// After Err returns a non-nil error, successive calls to Err return the same error.
Err() error
```
Additionally, a child context of the augmented context may end up being used, which will circumvent the stack trace capture. This change instead uses `errors.Wrap` in a few places that might help debug the original problem: 1) Where we call `Context.Err()` directly. 2) Where gRPC returns an error after possibly calling `Context.Err()` internally or returns an error that does not have a stack trace. Release note: None
-
- Feb 07, 2023
-
-
Rafi Shamim authored
The test setup was wrong, and was always using the latest sqlalchemy. This fixes the pinning, and also updates to a newer version. Release note: None
-
Marcus Gartner authored
release-22.1: sql/logictest: fix flaky test in unique
-
Marcus Gartner authored
This commit fixes a flaky test in the `unique` logic tests. The test could flake because an `UPSERT` violated two unique constraints, making the error message non-deterministic. Fixes #95968 Release note: None
-
Nathan VanBenschoten authored
release-22.1: kvcoord: heartbeat immediately to avoid being considered expired
-
Alex Sarkesian authored
This changes the `txnHeartbeater` to modify when we start our heartbeat loop in some cases. Previously, we would start the heartbeat loop (which writes the transaction record) one heartbeat interval (default 1s) after the first request in the transaction that acquires locks. In the case that more than 5 heartbeat intervals have passed since the first read in the transaction by the time that we encounter the first locking request, however, any other operations that encounter the locks and attempt to push (before the transaction heartbeats) will consider this transaction to be expired. To avoid this situation, this changes the interceptor to heartbeat immediately if the transaction would otherwise be considered expired before its first heartbeat interval. Release note (bug fix): Fixes a race condition where some operations waiting on locks can cause the lockholder transaction to be aborted if they occur before the transaction can write its record. Release justification: Bug fix.
-
Tobias Grieger authored
-
- Feb 06, 2023
-
-
Nathan Stilwell authored
For unknown reasons[1], when publishing Cluster UI from `release-22.1` the TypeScript type files are not being included in the `.tgz` file that is published to npm. Adding a wildcard to the `dist/` entry in the `files` property of the package.json seems to include them, so that change was made along with a version bump to publish Cluster UI when this change is merged. [1]: the package.json `files` property is the same on branches release-22.1 and release-22.2, as are the `.npmignore` and `.gitignore` files. When publishing Cluster UI from branch release-22.2 the types are included, but when publishing from release-22.1 they are not. Other factors considered were npm version, node version, and dependency versions. Epic: none Release note: None
-
Jackson Owens authored
release-22.1: vendor: bump Pebble to a30d64b32b0b
-
Jackson Owens authored
```
a30d64b3 vfs: handle concurrent directory Syncs in disk-health checking
5dee4bea db: add Options.WithFSDefaults
```
Epic: None Release note (bug fix): Fix bug where a disk stall could go undetected in rare circumstances where multiple goroutines sync the data directory concurrently. Release justification: Fix severe issue of undetected disk stall.
-
Jackson Owens authored
release-22.1: cli: close listeners and all open connections on disk stall
-
Jackson Owens authored
release-22.1: vendor: bump Pebble to 10f3aff6757a
-
Jackson Owens authored
Disk stalls prevent a node from making progress. Any ranges for which the stalled node is leaseholder may also be prevented from making progress while the stalled node remains online but incapacitated. CockroachDB nodes detect stalls within their stores through timing all write filesystem operations. Previously, when a stall was detected, Cockroach would simply fatal the process. However, a process blocked on disk IO cannot be terminated. The process would enter the zombie state, but would be unable to be reaped. This commit adds a new step to disk stall handling, closing all open sockets. Epic: None Release note (bug fix): Fix a bug where a node with a disk stall would continue to accept new connections and preserve existing connections until the disk stall abated.
-
Jackson Owens authored
```
10f3aff6 .github: install crlfmt@024b567c
287ed0f1 vfs: add SyncData,SyncTo,Preallocate to vfs.File
2c4a74ee vfs: clean up Fd functionality
```
Release note: Fix bug whereby a stalled disk would sometimes be undetected. Now the stall is detected any time a filesystem write operation is observed to last longer than the value in the storage.max_sync_duration cluster setting.
-
Tobias Grieger authored
This improves #96332 by including (as a tag) the goroutine ID under which spans are created. This allows following the trace in a Go execution trace if one is available. Epic: none Release note: None
- Feb 03, 2023
-
-
Nathan Stilwell authored
Epic: none Release note: None
-
Rafi Shamim authored
-
Miral Gadani authored
release-22.1: roachtest: allow TC_BUILDTYPE_ID to be accessible by Docker
- Feb 02, 2023
-
-
Nathan Stilwell authored
-
Nathan Stilwell authored
  - Using `actions/setup-node@v3`, a registry-url needs to be specified (one isn't defaulted and I'm unsure if npm publish will fall back to `registry.npmjs.org`, so better safe than sorry), as well as supplying the environment variable `NODE_AUTH_TOKEN` rather than `NPM_TOKEN` (which npm uses by default, but will be overridden by an `.npmrc`).
  - Adding a check for an existing tag.
  - Adding a "files" property to the package.json of Cluster UI to ensure that all the types are included in the publish.
  - ui: bumping Cluster UI version to trigger a publish

Epic: none Release note: None
-
Renato Costa authored
In #81103, the process of generating TeamCity links in test failure reports started relying on the `TC_BUILDTYPE_ID` environment variable. While that variable was added to TeamCity builds, it was not being passed down to Docker where the tests actually run. As a result, links generated by the GitHub poster were broken (see, for example, #81572). This commit makes `TC_BUILDTYPE_ID` accessible by Docker for every build that was already passing `TC_BUILD_BRANCH`. This should be sufficient to cover all existing cases and more, in case having access to this variable becomes useful in the future. Release note: None