graybeard

26
 
 

What is really needed, [Linus Torvalds] said, is to find ways to get away from the email patch model, which is not really working anymore. He feels that way now, even though he is "an old-school email person".

27
 
 

Have you guys tried magit yet? 😀

28
 
 

It's an opinion article, but I heavily agree with it. It's really sad that technical decisions are made by chimps who can't tell the difference between a computer and the internet.

29
 
 

I wasn't even aware there was a debate going on. Had anyone asked me before, I would've bet on Itanium having been removed from the tree ages ago!

30
31
1
submitted 11 months ago* (last edited 11 months ago) by [email protected] to c/[email protected]
 
 

Overview

This is a quick write-up of what I spent a few weeks trying to work out.

The adventure happened at the beginning of October, so things may have changed since - don't blindly copy-paste queries without making absolutely sure you're deleting the right stuff. Use SELECT generously.

When connected to the DB, run \timing. It prints the time taken to execute every query - really handy for getting a feel for which operations take longer.

I've had duplicates in instance, person, site, community, post and received_activity.

The quick gist of this is the following:

  • Clean up
  • Reindex
  • Full vacuum

I am now certain vacuuming is not, strictly speaking, necessary, but it makes me feel better to have all the steps I had taken written down.

\d - list tables (think of it as "describe database");

\d tablename - describe a table.

\o filename - save all query output to a file on the filesystem. /tmp/query.sql was my choice.
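For example, capturing one query's output to a file and then switching back (\o with no argument reverts to stdout):

\o /tmp/query.sql
SELECT COUNT(*) FROM instance;
\o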


instance

You need to turn enable_indexscan and enable_bitmapscan off to actually see the duplicates - otherwise the planner goes through the (broken) unique index, which hides them:

SET enable_indexscan = off;
SET enable_bitmapscan = off;

The following selects the dupes:

SELECT
	id,
	domain,
	published,
	updated
FROM instance
WHERE
	domain IN (
		SELECT
			domain
		FROM
			instance
		GROUP BY domain
		HAVING COUNT(*) > 1
	)
ORDER BY domain;

Deleting without using the index is incredibly slow - turn it back on:

SET enable_indexscan = on;
SET enable_bitmapscan = on;
DELETE FROM instance WHERE id = ;

Yes, you can build a fancier query to delete all the older/newer IDs at once. No, I do not recommend it. Delete one, confirm, repeat.
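For the curious, the one-shot variant would look something like the sketch below - deleting every row that has an older twin with the same domain, i.e. keeping the lowest id. Again, I recommend deleting one at a time instead.

-- NOT recommended: one-shot delete of all newer duplicates, keeping the lowest id per domain
DELETE FROM instance a
USING instance b
WHERE
	a.domain = b.domain
AND
	a.id > b.id;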

At first I was deleting the newer IDs; then, after noticing the same instances were still getting new IDs, I swapped to targeting the old ones. After noticing the same goddamn instances still getting new duplicate IDs, I had to dig deeper and, by sheer luck, discovered that I needed to reindex the database to bring it back to sanity.

Reindexing the database takes a very long time - don't do that. Instead, target the table - that should not take more than a few minutes. This, of course, all depends on the size of the table, but instance is naturally going to be small.

REINDEX TABLE instance;

If reindexing succeeds - you have cleaned up the table. If not - it will yell at you with the name of the first index it fails on. Rinse and repeat until it's happy.

Side note - it is probably enough to only reindex the index that's failing, but at this point I wanted to ensure at least the whole table is in a good state.
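Targeting a single index would look like this, with the index name taken from the error message (instance_domain_key here is just a placeholder):

REINDEX INDEX instance_domain_key;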


Looking back - if I could redo it - I would delete the new IDs only, keeping the old ones. I have no evidence, but I think getting rid of the old IDs introduced more duplicates in other related tables down the line. At the time, of course, it was hard to tell WTF was going on and making a wrong decision was better than making no decision.


person

The idea is the same for all the tables with duplicates; however, I had to modify the queries a bit due to small differences.

What I did at first, and you shouldn't do:

SET enable_indexscan = off;
SET enable_bitmapscan = off;

DELETE FROM person
WHERE
	id IN (
		SELECT id
		FROM (
			SELECT
				id,
				ROW_NUMBER() OVER (PARTITION BY actor_id ORDER BY id) AS row_num
			FROM person
		) t
		WHERE t.row_num > 1
		LIMIT 1
	);

The issue with the above is that it, again, runs a delete without using the index. It is horrible, it is sad, it takes forever. Don't do this. Instead, split it into a select without the index and a delete with the index:

SET enable_indexscan = off;
SET enable_bitmapscan = off;

SELECT
	a.id, a.actor_id, a.name
FROM person a
JOIN person b
ON
	a.actor_id = b.actor_id
AND
	a.id > b.id;
SET enable_indexscan = on;
SET enable_bitmapscan = on;

DELETE FROM person WHERE id = ;

person had dupes into the thousands - I just didn't have enough time at that moment and started deleting them in batches:

DELETE FROM person WHERE id IN (1, 2, 3, ... 99);

Again - yes, it can probably all be done in one go. I didn't, and so I'm not writing it down that way. This is where I used \o to capture the output and then chop it into batches with coreutils. You can do that, or you can make the database do it for you. I'm a better shell user than an SQL user.

Reindex the table and we're good to go!

REINDEX TABLE person;

site, community and post

Rinse and repeat, really. \d tablename, figure out which column to use when looking for duplicates, then delete, reindex, and move on.
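As a sketch, here's what the duplicate hunt looks like for community, assuming actor_id is again the column that should have been unique:

SET enable_indexscan = off;
SET enable_bitmapscan = off;

SELECT
	a.id, a.actor_id, a.name
FROM community a
JOIN community b
ON
	a.actor_id = b.actor_id
AND
	a.id > b.id;

SET enable_indexscan = on;
SET enable_bitmapscan = on;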


received_activity

This one deserves a special mention, as it had 64 million rows when I was looking at it. Scanning such a table takes forever and, upon closer inspection, I realised there's nothing useful in it. It is, essentially, a log file. I don't like useless shit in my database, so instead of trying to find the duplicates, I decided to simply wipe most of it in the hope the dupes would go with it. I deleted in increments of 1 million rows, each run taking ~30 seconds on the single-threaded 2GB RAM VM the database runs on. Working in increments kept the site running - the Lemmy backend starts timing out otherwise, and that's not great.

Before deleting anything, though, have a look at how much storage your tables are taking up:

SELECT
	nspname                                               AS "schema",
	pg_class.relname                                      AS "table",
	pg_size_pretty(pg_total_relation_size(pg_class.oid))  AS "total_size",
	pg_size_pretty(pg_relation_size(pg_class.oid))        AS "data_size",
	pg_size_pretty(pg_indexes_size(pg_class.oid))         AS "index_size",
	pg_stat_user_tables.n_live_tup                        AS "rows",
	pg_size_pretty(
		pg_total_relation_size(pg_class.oid) /
		(pg_stat_user_tables.n_live_tup + 1)
	)                                                     AS "total_row_size",
	pg_size_pretty(
		pg_relation_size(pg_class.oid) /
		(pg_stat_user_tables.n_live_tup + 1)
	)                                                     AS "row_size"
FROM
	pg_stat_user_tables
JOIN
	pg_class
ON
	pg_stat_user_tables.relid = pg_class.oid
JOIN
	pg_catalog.pg_namespace AS ns
ON
	pg_class.relnamespace = ns.oid
ORDER BY
	pg_total_relation_size(pg_class.oid) DESC;

Get the number of rows:

SELECT COUNT(*) FROM received_activity;

Delete the rows at your own pace. You can start with a small number to get an idea of how long it takes (remember \timing? ;)).

DELETE FROM received_activity where id < 1000000;

Attention! Do let the autovacuum finish after every delete query.
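One way to tell whether it's still busy - autovacuum workers show up in pg_stat_activity:

SELECT pid, state, query
FROM pg_stat_activity
WHERE query LIKE 'autovacuum:%';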

I ended up leaving ~3 million rows, which at the time represented ~3 days of federation. I chose 3 days as that is the timeout before an instance is marked as dead if no activity comes from it.

Now it's time to reindex the table:

REINDEX TABLE received_activity;

Remember the reported size of the table? If you check your system, nothing will have changed - that is because Postgres does not release freed-up storage back to the operating system. It makes sense under normal circumstances, but this situation is anything but.

Clean all the things!

VACUUM FULL received_activity;

Now you have reclaimed all that wasted storage to be put to better use.

In my case, the database (not the table) shrank by ~52%!
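If you want the same before/after number for your own setup, this one-liner reports the current database's size - run it before and after the vacuum:

SELECT pg_size_pretty(pg_database_size(current_database()));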


I am now running a cronjob that deletes rows from received_activity that are older than 3 days:

DELETE FROM
	received_activity
WHERE
	published < NOW() - INTERVAL '3 days';
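To sanity-check the cutoff before letting the job loose, the same predicate works with a plain count:

SELECT COUNT(*)
FROM received_activity
WHERE published < NOW() - INTERVAL '3 days';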

In case you're wondering whether it's safe to delete such logs from the database - the Lemmy developers seem to agree here and here.

32
 
 

Archive link

Good. VMware has been adding so much useless stuff it's astonishing. Anyone thinking about migrating - check OSS things out. Might not be as advanced, but do you really need all of that crap?

33
 
 

Archive link

I've never used any BSDs directly, only in the shape of opnSense, but as a fan of Gentoo, which uses Portage, which in turn is heavily inspired by the ports system, I should probably give one of them a go at some point.

My biggest deterrent so far has been the lower performance compared to Linux. I objectively understand it's imperceptible in everyday use, but something at the back of my head has been holding me back.

34
1
submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]
 
 

Archive link

Another take on software testing. I think he's wrong to dismiss integration testing, but it's a nice read.

35
 
 

Archive link

It's nice to see a real example of a company doing the right thing.

Doesn't happen all that often.


EDIT: I stand corrected. This is not all that great. Not terrible, yet, but the path is no longer clear.

36
1
Way Forward Machine (wayforward.archive.org)
submitted 1 year ago by [email protected] to c/[email protected]
 
 

Thanks, I hate it :D

37
 
 

cross-posted from: https://radiation.party/post/138500

[ comments | sourced from HackerNews ]

This is such a great write-up! I've definitely learnt something new today!

38
 
 

Now do Java's keystore. changeit is a perfectly acceptable password, right? :D

39
 
 

ssh-keygen(1): generate Ed25519 keys by default.

Finally!

  • ssh(1): add keystroke timing obfuscation to the client. This attempts to hide inter-keystroke timings by sending interactive traffic at fixed intervals (default: every 20ms) when there is only a small amount of data being sent. It also sends fake "chaff" keystrokes for a random interval after the last real keystroke. These are controlled by a new ssh_config ObscureKeystrokeTiming keyword.

Interesting! I wonder if there was some proof-of-concept sniffer guessing passwords from keystroke timings.

40
 
 

Archive link

I've never been one to care about this sort of shit happening to famous people, but the work of RMS has provided me with so much. It has allowed me to have a hobby, it has enabled my career. I appreciate his unwavering dedication towards his ideals. I wish I could be this dedicated.

One of the comments on TheReg hit me hard:

His absolutist, no-compromise attitude to software freedom has benefited all of us, even if sadly it's earned him detractors on the way.

“The reasonable man adapts himself to the world: the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.”

If ever that quote applied to anyone, it's RMS. I'm more on the esr side of things, but you can't deny that without RMS we would probably be more locked into walled gardens than we are now.

41
 
 

Was testing things and ran into autofill errors with KeePassXC. It looks like the Firefox plugin manages to submit the full-length password even if the input field is limited to a smaller number of characters. Manually pasting the password truncates it, though.

42
 
 

Good for testing addons/settings/etc

43
44
 
 

ghostarchive.org


Does not have a DRAM cache.

The S770 and S880 have their model numbers reversed, and I wish I had reviewed them in the other order. The S770 is a better drive, with a better bundle in that it comes with a heatsink. They both suffer from wildly inaccurate thermal reporting, and they both come from Fanxiang, which is essentially a no-name brand as far as much of our readership will be concerned. With that said, both drives are fast and incredibly inexpensive. If you are in the market for fast and cheap, buy an S770. If it is out of stock, you can buy the S880!

45
 
 

ghostarchive.org


I'm sorry for the people affected, but there's something in me that's happy seeing Oracle fail.

As for the general point of the article - I do agree. Obviously, being a Linux enthusiast, I would prefer if local gov had its own IT with some shared requirements for interop with other councils.

46
 
 

ghostarchive.org


telling those living within a 50 mile (80km) radius of a Big Blue office to be at their desks at least three days a week

This feels a bit discriminatory, but also sort of an obvious solution for those who can move or pretend to have moved to their parents' place, etc.

Big corporations still have major investments in real estate to justify to shareholders, and management at many companies prefer to see bums on seats – a phenomenon Microsoft previously termed productivity paranoia.

Happens incredibly rarely, but I'm with Microsoft on this one.


In general I do see the point of mingling, especially during the probation period, when so many things are new. But the forceful, out-of-thin-air number of days in the office is daft. They could at least make it a moving average over a quarter or two. Or a whole year.

47
 
 

ghostarchive.org

An interesting read on chip making from an unusual source.

48
 
 

Archive.today link

I've enjoyed reading the article and, after going down the rabbit hole of CVEs, NVD and the like, I now wholeheartedly support Daniel's stance. Current CVE management is stupid and broken.

49
 
 

https://archive.ph/kV6UO

Yay, sales-driven development!

50
 
 

https://archive.ph/OaoHk

Visual Studio != VSCode.
