yt-dlp is the gold standard for that.
https://github.com/yt-dlp/yt-dlp
Tag cleanup and album art are their own beast that you’ll wanna tackle post-download, but beets is another gold standard tool that can help with that layer.
No need to cargo-cult security practices here, chief. You’re not gonna get pwned by publishing your hardware specs. If you’re planning to build some kinda webapp for yourself, that’s a different story - but you have to fuck up hard to get hacked while hosting raw HTML.
Use an SSH key, disable password auth, make sure you’re firewalled (i.e. test with nmap), and call it a day.
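If you want a quick sanity check on the "disable password auth" part, here's a rough sketch that reads sshd_config text and tells you whether password logins are off. It assumes the first occurrence of a keyword wins (which is how sshd treats its config), that the default is "yes" when the keyword is absent, and it deliberately ignores Include files and anything inside Match blocks. The function name is made up for illustration.

```python
def password_auth_disabled(config_text: str) -> bool:
    """Return True if the global section of an sshd_config sets
    PasswordAuthentication to "no".

    Simplifications (assumptions of this sketch): Include directives are
    not followed, and parsing stops at the first Match block.
    """
    for raw_line in config_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # sshd keywords are case-insensitive
        if keyword == "match":
            break  # only the global section is inspected
        if keyword == "passwordauthentication" and len(parts) == 2:
            # First occurrence wins, so decide right here.
            return parts[1].strip().lower() == "no"
    # Keyword never appeared: password auth is on by default.
    return False


if __name__ == "__main__":
    with open("/etc/ssh/sshd_config") as fh:
        print("password auth disabled:", password_auth_disabled(fh.read()))
```

This only checks the config file; the nmap scan from outside your network is still what tells you whether the firewall is doing its job.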
While I’m sure there’s a pre-canned tool out there for you, if you have basic software experience (which you seem to), this is one of those times where it’s usually most efficient to hack together a dumb CGI script and call it a day.
This prompt should get you most of the way there, using your LLM of choice:
Write a minimalist cgi script to help upload files to a server. Upon a GET request, serve a light page with a centered form that takes in a file and a submission code. Submission codes will be stored on individual lines of a plaintext file. Adding new codes to this file is out of scope - but the codes will be 8-char hex strings (do validate that submission strings are not empty!). The script should accept the submission as a POST, and save the file to an upload dir if the submission code is valid.
Vet the output, harden as needed, set up a systemd service to serve with busybox httpd, and optionally reverse-proxy. If you’ve done this sorta thing before, you can probably knock it out in a half hour.
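For the "vet the output, harden as needed" step, these are the two pieces I'd look at first: the submission-code check and the filename handling. This is a minimal sketch, not the full CGI script. The function names and the codes file layout (one 8-char hex code per line) are assumptions based on the prompt above.

```python
import hmac
import re
from pathlib import Path

# Submission codes are 8-character hex strings, per the prompt.
CODE_PATTERN = re.compile(r"[0-9a-f]{8}")


def is_valid_code(submitted: str, codes_file: Path) -> bool:
    """Return True only if `submitted` is a well-formed code listed in codes_file."""
    candidate = submitted.strip().lower()
    # Rejects empty strings and anything that is not exactly 8 hex chars.
    if not CODE_PATTERN.fullmatch(candidate):
        return False
    matched = False
    for line in codes_file.read_text().splitlines():
        known = line.strip().lower()
        # Constant-time comparison; blank lines in the file never match.
        if known and hmac.compare_digest(known.encode(), candidate.encode()):
            matched = True
    return matched


def safe_upload_name(filename: str) -> str:
    """Strip directory components so an upload cannot escape the upload dir."""
    name = Path(filename.replace("\\", "/")).name
    if name in ("", ".", ".."):
        raise ValueError("unusable filename")
    return name
```

Whatever the LLM hands back, make sure an empty code never matches a blank line in the codes file, and that a filename like `../../etc/passwd` gets reduced to its last component before anything is written to disk.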
This is @Shimitar@downonthestreet.eu’s work, not mine - but it’s pretty similar to how I’d set things up:
https://wiki.gardiol.org/doku.php?id=networking%3Assh_tunnel
This is a great suggestion!
Lest anyone miss the buried lede, this approach means that traffic is pre-encrypted as it passes through the gateway VPS - so even if your VPS gets hacked, it’s way harder to steal credentials and break into the services running on your home network.
If you’re looking for sympathy, you got it. Fuck the state.
If you’re looking for solutions, use a cheap $5/mo VPS that exists purely as your gateway host. Run everything you want on your home machines, then tunnel the traffic to your gateway and reverse-proxy it there. Your data stays in your hands, you can spin up and expose new services publicly in a matter of minutes, AND your home IP isn’t vulnerable to doxxing or DoS.
Object storage is indeed a specialized filesystem in a trenchcoat.
Object storage is typically (but not always) associated with non-hierarchical key-value lookups, as opposed to the directory tree pattern most file systems use. Object storage systems are also typically (but not always) designed with sharding and distribution in mind.
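To make the key-value point concrete, here's a toy in-memory sketch (the class and its methods are invented for illustration, not any real object store's API). Keys are flat strings, and anything that looks like a directory is just a shared prefix.

```python
class FlatObjectStore:
    """Toy object store: one flat namespace mapping string keys to bytes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # No parent "directory" has to exist first; the key is just a string.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def list(self, prefix: str = "") -> list[str]:
        # "Listing a directory" is a prefix filter over all keys.
        return sorted(k for k in self._objects if k.startswith(prefix))


if __name__ == "__main__":
    store = FlatObjectStore()
    store.put("photos/2024/cat.jpg", b"...")
    store.put("photos/2023/dog.jpg", b"...")
    store.put("notes.txt", b"hello")
    print(store.list("photos/"))
```

The slashes in those keys carry no special meaning to the store itself, which is part of why this layout is easy to shard across machines: any key can be hashed to a node without walking a tree.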
I was hoping to play around with the dataset over the weekend to toy with some text-embedding techniques, but they’ve pulled the cord on the download links.
Anyone have a copy of the full archive they’re willing to share, or a magnet link?
Synology runs a proprietary OS OOTB that’s had multiple sloppy vulns exposing full remote access to users’ files. Putting your data in the hands of fuckups who have and will continue to leak it is the opposite of total control.
It’s completely trivial to store any data you want to in a cloud provider 100% securely just by piping it through openssl before uploading.
Just to throw out an easy option: if the music is well-labeled on YouTube, you can get pretty close to that full suite with just yt-dlp by using --embed-thumbnail as a stand-in for album art, dumping your files with an “Artist - track - album” naming structure using the --output template flag, then using an awk or python script as a second pass to add the artist/track/album names to each file as tags.
E: and in case it isn’t self-evident, you don’t have to give yt-dlp a URL for each track; it’ll work fine with a playlist URL.
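Here's roughly what that looks like in Python, as a sketch rather than a finished tool. It builds the yt-dlp command with an output template, then parses the resulting filenames back into artist/track/album. The function names are made up, the template assumes the videos actually carry artist, track and album metadata, and writing the tags into the files is left to whatever tagging library you prefer.

```python
from pathlib import Path

# yt-dlp output template matching the "Artist - track - album" layout.
OUTPUT_TEMPLATE = "%(artist)s - %(track)s - %(album)s.%(ext)s"


def build_ytdlp_command(playlist_url: str) -> list[str]:
    """Build the argument list for a single yt-dlp run over a playlist."""
    return [
        "yt-dlp",
        "--extract-audio",    # keep only the audio stream
        "--embed-thumbnail",  # stand-in for album art
        "--output", OUTPUT_TEMPLATE,
        playlist_url,
    ]


def parse_tags(path: str) -> dict[str, str]:
    """Split an "Artist - track - album.ext" filename into its three fields."""
    parts = Path(path).stem.split(" - ")
    if len(parts) != 3:
        # Titles that themselves contain " - " need manual attention.
        raise ValueError(f"cannot parse {path!r}")
    artist, track, album = (part.strip() for part in parts)
    return {"artist": artist, "track": track, "album": album}


if __name__ == "__main__":
    # Pass build_ytdlp_command(url) to subprocess.run() to download, then:
    for audio_file in sorted(Path(".").glob("* - * - *.*")):
        print(audio_file.name, parse_tags(audio_file.name))
```

The obvious weak spot is any title with " - " in it, so the parser raises instead of guessing; those few files are easier to fix by hand.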