RISC OS Open
Safeguarding the past, present and future of RISC OS for everyone
SMB2/3 protocol

 
Jul 4, 2023 6:56pm
Avatar Paolo Fabio Zaino (28) 1634 posts

You know this for a fact

Fact. At this point it has been calculated that 64% of all internet traffic is “bot” traffic (crawlers, scanners, also malware and spam), not human generated (read: full stop, not an opinion).

Crawling and web scraping are techniques you can now use even to build your own web analyzers. A few libraries to back that up:

In Golang:

http://go-colly.org/

In Rust:

https://github.com/mattsse/voyager

There are obviously libraries in every language, including Python, JS, C++, you name it.

Crawlers can obviously be specialised in seeking out code, news, or specific topics like political activism etc.

Bots can be designed to connect to social networks and chat systems; this includes old-school stuff like IRC, but also Discord etc.

Obviously there are also more commercial tools like Maltego, which can gather information across the internet on whatever topic you are interested in. For instance, the RISC OS Search engine (which keeps appearing and disappearing) is built using data collection techniques and crawling.

Datasets for AI can be built from every resource that can be defined as “reliable”, where in this case reliable means the code presented works, hence it’s a good source of examples.

ARM Assembly is still a valuable source, for instance, not just for ChatGPT amenities, but also to find malware in executables etc. So not all AI training is done to trigger Rick’s posts on here ;)

You mean the part where RHEL is now closed source?

One more misinformed horn in the orchestra… (facepalm).

Here is the response of the guy who started this and his apologies for that mistake: https://www.jeffgeerling.com/blog/2023/i-was-wrong

RedHat is not breaching the GPL, end of the story. They are now part of IBM, and IBM’s profit expectations are most likely higher than those of a company that was originally trying to be disruptive in the market, not to mention that IBM has to cover the ROI from acquiring RH (which was a very expensive acquisition). So the reasons behind their changes are purely economic. They can’t break the GPL, because if they do, they are the ones who will lose the sources. Hope this makes sense; if it doesn’t, read the YouTuber’s apology again.

You know, this entire post, from the AI to RHEL demonstrates what happens when the corporate mentality meets open source.

This is true, can’t argue with that. But the original post was about “safe” public repositories of code, and the answer is: if you are on a crusade against AI (or more generally against people taking your code from whatever public repository, including your own website), then DON’T publish it at all :)

As others have mentioned, you can still use git (which has nothing to do with all of the above) and run GitLab on your own premises, without exposing it to crawlers.

 
Jul 4, 2023 7:02pm
Avatar Stuart Swales (8827) 1075 posts

The blank sections are “fill in yourself depending on the hardware”. Make of that what you will…

At least the blank sections don’t contain utterly duff code!

 
Jul 4, 2023 7:04pm
Avatar Paolo Fabio Zaino (28) 1634 posts

Release notes are something different (more human readable for a start). ;)

Agreed, this is an example of how to generate them automatically from git on github (but the action should work on gitlab as well):


# .github/release.yml

changelog:
  exclude:
    labels:
      - ignore-for-release
    authors:
      - your-name
  categories:
    - title: Breaking Changes 
      labels:
        - Semver-Major
        - breaking-change
    - title: Exciting New Features 
      labels:
        - Semver-Minor
        - enhancement
    - title: Other Changes
      labels:
        - "*"

There are also pre-built actions, so even easier to use:

https://github.com/marketplace/actions/release-notes-generator

In general, everything that can be done with legacy tools can also be done with tools like GitHub and GitLab. The automation side can also be implemented on legacy systems, via some scripting and similar.

 
Jul 4, 2023 7:09pm
Avatar Paolo Fabio Zaino (28) 1634 posts

For what it’s worth, I asked ChatGPT about setting up an MEMC1a for an A5000 with 4MB of memory.
It suggested the following code:

That code looks suspiciously like an old U-Boot patch I have seen somewhere else, just with a different MEMC define value… it’s GNU ASM, so possibly a mix of ARM code / patches with values that may come from old Archimedes documentation. If so, technically it isn’t copying, or at least no worse copying than a human doing the same…

But again, if AI dataset crawlers are a problem for you, there is only one thing to do: keep your code fully private.

Now, I think we have gone really off topic very very much and I hope this discussion can get back on the SMB2/3 work :(

P.S. My apologies, I have contributed to moving it even more off topic.

 
Jul 4, 2023 7:13pm
Avatar Rick Murray (539) 13048 posts

RedHat is not breaching the GPL, end of the story.

Given that I pretty much said that, exactly how am I a misinformed horn?

 
Jul 4, 2023 8:08pm
Avatar Paolo Fabio Zaino (28) 1634 posts

Given that I pretty much said that, exactly how am I a misinformed horn?

Sorry, then I misunderstood what you meant.

 
Jul 4, 2023 10:18pm
Avatar Steve Fryatt (216) 1933 posts

I admit I have not looked too hard, but a visual diff utility would be a big help. Seeing the code changes in a ROOL GitLab MR is really easy to read.

There’s Harriet Bazley’s SideDiff, although I appear to have a more recent version than the one available for download!

Under RISC OS, I don’t know how to achieve that with my copy/paste of directories.

The problem is that it isn’t as easy to do with a copy/paste of directories as it is with a proper VCS. That’s the reason why people started to write the things in the first place… :-)

 
Jul 4, 2023 10:28pm
Avatar Jake Hamby (8915) 21 posts

Dave, thanks for the suggestion to look at Mbed TLS. I didn’t know about that possibility, and it does look like it could be smaller and simpler. The most unusual requirement of SMB 3 is to use a specific “NIST 800-108 section 5.1” key derivation algorithm. I’ve just now learned, from looking at smb2_signing.c (the Samba source file that implements smb2_key_derivation()), that it is implemented with a series of SHA-256 hash operations, which I can do myself.
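To make that concrete, the counter-mode KDF from NIST SP 800-108 section 5.1 fits in a few lines. This is only a minimal sketch, assuming HMAC-SHA256 as the PRF with a 32-bit big-endian counter and a 32-bit length field; the function name is made up for illustration, and the label and context values used with it below are placeholders rather than the real SMB 3 strings, which should be taken from the SMB specification or Samba's source.

```python
import hashlib
import hmac
import struct


def kdf_counter_hmac_sha256(key: bytes, label: bytes, context: bytes,
                            out_len: int = 16) -> bytes:
    """Counter-mode KDF sketch (hypothetical helper, not Samba's code).

    Each block is HMAC-SHA256(key, [i] || label || 0x00 || context || [L]),
    where [i] is a 32-bit counter starting at 1 and [L] is the requested
    output length in bits, both big-endian.
    """
    length_bits = struct.pack(">I", out_len * 8)
    output = b""
    counter = 1
    while len(output) < out_len:
        block_input = (struct.pack(">I", counter) + label + b"\x00"
                       + context + length_bits)
        output += hmac.new(key, block_input, hashlib.sha256).digest()
        counter += 1
    # Truncate to the requested number of bytes.
    return output[:out_len]


# Placeholder inputs, for illustration only.
session_key = bytes(16)
derived = kdf_counter_hmac_sha256(session_key, b"ExampleLabel\x00",
                                  b"ExampleContext\x00")
```

For a 16-byte key the loop runs exactly once, so in that case the whole derivation is a single HMAC call followed by a truncation.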

The benefit of OpenSSL/LibreSSL is it has a plug-in architecture that lets you type commands like:

openssl kdf -keylen 10 -kdfopt digest:SHA2-256 -kdfopt key:secret -kdfopt salt:salt -kdfopt info:label HKDF

And it eventually, I see now, would look up in its plug-in architecture the same sequence of SHA-256 hashes as Samba just implements inline, using GnuTLS in Samba’s case. That’s very interesting: I would’ve guessed that Samba would have more than one option for encryption libraries.

For the past few hours, I’ve been pondering the difference between two Wireshark captures of downloading the same PDF scan of an Acorn User magazine using the built-in Python 3 web server ("python3 -m http.server") to figure out exactly what makes !NetSurf and nearly every other browser, package manager, etc. so much slower than they ought to be. 1.5 MB/s over a gigabit link makes no sense, but once I saw what’s going on, it makes perfect sense.

Replacing or upgrading the TCP/IP stack won’t solve the underlying issue with extremely slow download speeds, which is that the TCP window fills up while the sender (Web server) is waiting for ACK responses, which arrive invariably about 10 ms later. Now what runs at 100 Hz in RISC OS? That’s right: TickerV. Basically, the download is so fast that the TCP/IP client app, at least the GUI ones, can’t catch up without calling socketread() much more responsively than they’re doing. They’re probably giving up CPU time just before they would have received new data, but they also can’t run in a busy loop using 100% CPU.

If my diagnosis is correct, and I think the evidence from the packet capture (TCP window full then waiting 0.01 sec for an ACK before sending more) is fairly conclusive, then I shouldn’t have much to worry about with my socket abstraction layer for the SMB client because the combination of being driven by Internet_Event messages, non-blocking socket calls, and running I/O background threads at real-time priority, should enable the file data cache that I’ve designed already to window the file data into the 500 KB or more (for a 50 MB/s download speed) that the app has to process every OS tick in order to achieve that read speed.
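The arithmetic behind those figures can be sketched as follows. This is a back-of-envelope model only, assuming the sender can have at most one TCP window in flight per 10 ms tick and ignoring network latency and protocol overhead; the function names are illustrative.

```python
TICKS_PER_SECOND = 100  # the 100 Hz tick, i.e. one tick every 10 ms


def bytes_per_tick(rate_bytes_per_second: int) -> int:
    """Data that must be consumed every tick to sustain the given rate."""
    return rate_bytes_per_second // TICKS_PER_SECOND


def max_rate_for_window(window_bytes: int) -> int:
    """Upper bound on throughput if only one window is ACKed per tick."""
    return window_bytes * TICKS_PER_SECOND


observed_window = bytes_per_tick(1_500_000)   # window implied by 1.5 MB/s
target_per_tick = bytes_per_tick(50_000_000)  # needed per tick for 50 MB/s
```

Under these assumptions the observed 1.5 MB/s corresponds to roughly 15,000 bytes being acknowledged per tick, while 50 MB/s needs 500,000 bytes per tick, which matches the 500 KB figure above.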

Since in my ideal future, the SMB2/3 client will be even faster than the onboard SD card reader of a typical SBC, and everything will just work and be fast, I’m encouraged to keep working on it because I don’t think the GUI Internet client bottleneck will apply to the code I’ve written. I’m unsure what the best strategy is for speeding up apps that aren’t written with RISC OS in mind, that don’t have custom code that creates RTSupport threads and handles Internet_Event callbacks and uses SyncLib mutexes.

I haven’t looked at the UnixLib code lately, and I remember it’s quite tricky, but I suspect there must be ways to improve select(), poll(), and the emulated UNIX processes and threads to try to wake up the process that received data (or whose TCP window is now empty to send more data, since the same issue in reverse will happen with uploads) without delaying to the next OS tick. The problem isn’t having a 100 Hz system tick, but that the OS is preferring to idle for some reason until the next tick rather than wake up the app that needs to be awoken to handle the data.

 
Jul 4, 2023 11:36pm
Avatar Chris Mahoney (1684) 2079 posts

Isn’t there something a little more recent than that? Eight releases with the same version number by my count, which is definitely “at least seven”… :-)

Bingo.

Scanning all of the code hosted on a website they own is one thing (they probably slipped it into the Ts&Cs when they bought it), but downloading a file from my site and taking the code from within? Blatant copyright infringement.

On that note, I should probably put a robots.txt on my site that disallows downloads. Of course, the next question is figuring out how to get them to delete anything they’ve already harvested, which I suspect won’t happen automatically!
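As a rough idea of what such a robots.txt might look like and how a well-behaved crawler would interpret it, here is a minimal sketch using Python's standard urllib.robotparser. The bot name "ExampleBot", the /downloads/ path and the example.org URLs are all made up for illustration, and robots.txt is purely advisory: a crawler that ignores it will not be stopped.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep one named crawler out of /downloads/,
# and leave everything open to all other user agents.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /downloads/

User-agent: *
Disallow:
"""


def is_allowed(agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a compliant crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


blocked = is_allowed("ExampleBot", "https://example.org/downloads/app.zip")
browsing = is_allowed("SomeBrowser", "https://example.org/downloads/app.zip")
```

Here `blocked` is False and `browsing` is True, so ordinary visitors are unaffected while the named crawler is asked to stay out of the downloads directory.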

 
Jul 5, 2023 6:17am
Avatar Steve Pampling (1551) 7678 posts

Basically, the download is so fast that the TCP/IP client app, at least the GUI ones, can’t catch up without calling socketread() much more responsively than they’re doing. They’re probably giving up CPU time just before they would have received new data, but they also can’t run in a busy loop using 100% CPU.

Modern RO hardware is multicore, with all except one core doing sod all.
A common feature of PCs is that, even though the Main CPU is multicore, the network interface has its own low grade processor, so the system can offload the job of data transfer from main CPU to the interface processor.

 
Jul 5, 2023 12:18pm
Avatar Paolo Fabio Zaino (28) 1634 posts

Guys,
we probably want to move the “block the AI agents from stealing my code” + other off-topic spin-offs to another topic?

Jake’s new update almost got lost in the usual off-topic discussions… :(

 
Jul 5, 2023 6:25pm
Avatar Steve Pampling (1551) 7678 posts

Jake’s new update almost got lost in the usual off-topic discussions… :(

I thought that pointing out that Pi hardware had cores lying idle and that every other OS seems to offload network i/o to another processor might be on topic.
Oh, well.

 
Jul 5, 2023 6:42pm
Avatar Colin Ferris (399) 1602 posts

Using a spare core to do the Printing :-)

 
Jul 5, 2023 6:57pm
Avatar Rick Murray (539) 13048 posts

Steve – yeah, but since network stuff tends to get offloaded to the network interface rather than just a different core in the main processor unit (which other OSes would be using all of already)… we’d still have n>1 cores doing sod all.

 
Jul 5, 2023 7:05pm
Avatar Rick Murray (539) 13048 posts

The problem isn’t having a 100 Hz system tick, but that the OS is preferring to idle for some reason until the next tick rather than wake up the app that needs to be awoken to handle the data.

Some reason is this – I have a poll speed monitor. It simply sits on null poll and counts how many times it gets called in a second. Well, the output is messed up as a modern machine not doing anything gets far more than 1,000 polls per second (it was written in the days of my A5000 so it only copes with three digits).

So if the desktop can do upwards of a thousand polls a second, and your switcher is working at 100ths of a second increments, there’s a lot of potential for the system to get bored and fall asleep. ;)

Did the idea of the FastTickerV (millisecond) and related FastCallAfter ever gain any traction?

 
Jul 5, 2023 8:29pm
Avatar Dave Higton (1515) 3246 posts

that the app has to process every OS tick

Why should the app wait for an OS tick to do something?

My inclination is to process everything that’s there as soon as it’s there, and only yield when either there isn’t anything there, or a small number of OS ticks have occurred, whichever is the sooner.
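Roughly, that loop could be sketched like this. It is an illustration on an ordinary non-blocking socket in Python, not RISC OS code; the function name, chunk size and time budget are assumptions chosen for the example.

```python
import socket
import time


def drain_available(sock: socket.socket, handle, max_seconds: float = 0.02,
                    chunk_size: int = 65536) -> int:
    """Process data as long as it keeps arriving, then return (i.e. yield).

    Stops when the socket has nothing more to read, when the peer closes,
    or when the time budget (about two 10 ms ticks by default) is used up.
    """
    sock.setblocking(False)
    deadline = time.monotonic() + max_seconds
    total = 0
    while time.monotonic() < deadline:
        try:
            data = sock.recv(chunk_size)
        except BlockingIOError:
            break            # nothing there right now: time to yield
        if not data:
            break            # peer closed the connection
        handle(data)
        total += len(data)
    return total
```

The caller would invoke this whenever it is told data is ready, and go back to polling only once it returns.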

Rather than waiting for an OS tick and then doing something, instead wait for an OS tick and do nothing :-)

 
Jul 5, 2023 9:02pm
Avatar Steve Pampling (1551) 7678 posts

Steve – yeah, but since network stuff tends to get offloaded to the network interface rather than just a different core in the main processor unit

It does? I must talk to one of those IT guys ;)
