• Thorry84@feddit.nl
    arrow-up
    66
    ·
    1 year ago

    For people interested in the difference between decompiled machine code and source code I would recommend looking at the Mario 64 Decomp project. They are attempting to turn a Mario 64 ROM into source code and then back into that same ROM. It’s really hard and they’ve been working on it for a long time. It’s come a long way but still isn’t done.

    https://github.com/n64decomp/sm64

      • Thorry84@feddit.nl
        arrow-up
        6
        ·
        1 year ago

        There is still some stuff that needs documenting, but the original goal of recompiling the created source code into the ROMs has been achieved. People are still actively working on it, so in that sense it’s maybe never done.

    • voxel@sopuli.xyz
      arrow-up
      2
      ·
      edit-2
      1 year ago

      well assembly is technically “source code” and can be 1:1 translated to and from binary, excluding “syntactic sugar” stuff like macros and labels added on top.
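To make the "1:1 to and from binary" point concrete, here is a minimal sketch of a round trip between bytes and mnemonics. It assumes a tiny five-instruction subset of the 6502; the table layout and the function names `disassemble` and `assemble` are made up for illustration and are not taken from any real tool.

```python
# Toy 6502 subset: opcode byte -> (mnemonic, number of operand bytes).
OPCODES = {
    0xA9: ("LDA #", 1),  # load accumulator, immediate value
    0x8D: ("STA", 2),    # store accumulator, absolute address
    0x20: ("JSR", 2),    # jump to subroutine
    0x60: ("RTS", 0),    # return from subroutine
    0xEA: ("NOP", 0),    # no operation
}
# Reverse table: mnemonic -> (opcode byte, number of operand bytes).
MNEMONICS = {name: (op, size) for op, (name, size) in OPCODES.items()}


def disassemble(code):
    """Turn raw bytes into a list of (mnemonic, operand) pairs."""
    listing = []
    i = 0
    while i < len(code):
        name, size = OPCODES[code[i]]
        # Multi-byte operands are stored little-endian on the 6502.
        operand = int.from_bytes(code[i + 1:i + 1 + size], "little") if size else None
        listing.append((name, operand))
        i += 1 + size
    return listing


def assemble(listing):
    """Turn (mnemonic, operand) pairs back into raw bytes."""
    out = bytearray()
    for name, operand in listing:
        opcode, size = MNEMONICS[name]
        out.append(opcode)
        if size:
            out += operand.to_bytes(size, "little")
    return bytes(out)


program = bytes([0xA9, 0x01, 0x8D, 0x00, 0x02, 0x20, 0x34, 0x12, 0x60])
listing = disassemble(program)
roundtrip = assemble(listing)
```

The round trip only recovers bare mnemonics and numeric operands. Labels, macros and comments are gone, which is what the replies below are getting at.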

      • 257m@lemmy.ml
        arrow-up
        4
        ·
        1 year ago

        The code is produced by the compiler but it is not the original source. To qualify as source code it needs to be in the original language it was written in and a one-for-one copy. Calling compiler-produced assembly source code is wrong, as it isn’t what the author wrote and there could be many versions of it depending on architecture.

      • Malfeasant@lemm.ee
        arrow-up
        4
        ·
        1 year ago

        But those things you’re excluding are the most important parts of the source code…

        • 257m@lemmy.ml
          arrow-up
          3
          ·
          edit-2
          1 year ago

          By excluded he means macro assemblers, which in my mind do qualify as an actual language as they have more complicated syntax than instruction arg1, arg2 …

  • just_ducky_in_NH@lemmy.world
    arrow-up
    30
    ·
    1 year ago

    Okay, boomer here, be gentle.

    So back in the ‘70s I dabbled in programming (now called “coding”, I hear). I only did higher-level languages like Fortran, Cobol, IBM Basic, but a friend had a job (at age 13!) programming in assembler. Is assembler now called assembly, or are they different?

    • fidodo@lemm.ee
      arrow-up
      33
      ·
      1 year ago

      It’s still called programming, coding is the same thing. Assembler more commonly refers to the utility program that converts the assembly code to machine code while assembly refers to the code itself, but the term assembler code is also valid. It’s uncommon to simply call the code assembler because it would be easily confused with the utility program.

    • Thwompthwomp@lemmy.world
      arrow-up
      9
      ·
      1 year ago

      I thought that the assembler is a specific program that translates mnemonics into the corresponding machine code. Perhaps in early computing this was done by hand so a person was the assembler (and worked in assembler), but now that is handled by software (and supports various macros). So programming in assembly would generate a stream of text that must be assembled by an assembler. (Although I have heard people refer to programming in assembler as well, just not often.)

      • lhamil64@programming.dev
        arrow-up
        9
        ·
        1 year ago

        I hear people say “program in assembler” but IMO that’s wrong. I’d say you write the code in “assembly language” (or better yet, the actual architecture you’re using like “x86 assembly”) but you “assemble” it with an “assembler”. Kind of like how you could write a program in the “C language” and “compile” it with a “compiler”

        • amki@feddit.de
          arrow-up
          1
          ·
          edit-2
          1 year ago

          A compiler and an assembler do wildly different things though. An assembler simply replaces mnemonics while a compiler transfers instructions to a whole other language.

          • Malfeasant@lemm.ee
            arrow-up
            1
            ·
            1 year ago

            Depends on the language, really… C maps pretty closely to assembly language, it’s not as simple as one mnemonic to one machine code byte, more like tokens get mapped to sequences of machine code, a function call translates to some code that sets up a stack frame, a return tears it down…

    • Overzeetop@sopuli.xyz
      English
      arrow-up
      2
      ·
      1 year ago

      I was too young/poor to afford an assembler for my 6502, so I wrote out the assembly longhand on a legal pad and then manually converted each operation to machine code.

      Needless to say my programs done this way were exceptionally simple, but it’s interesting to understand the underlying code.

    • Psythik@lemm.ee
      arrow-up
      36
      ·
      1 year ago

      I can’t wait for AI to make a PC port of every console game ever so that we can finally stop using emulators.

      • amki@feddit.de
        arrow-up
        20
        ·
        1 year ago

        This won’t happen in our lifetime. Not only because this is more complex than rambling vaguely correlated human speech while hallucinating half the time.

          • 257m@lemmy.ml
            arrow-up
            12
            ·
            edit-2
            1 year ago

            That doesn’t really translate to neural nets though. There is nothing inherent about matrix multiplication that would make it good at reading code. Also, computers aren’t reading code, they are executing it. The hardware just reads instruction by instruction and performs each one; it has no idea what the high-level purpose of what it is doing actually is.

          • gens@programming.dev
            arrow-up
            3
            ·
            1 year ago

            Half of programming is writing code, the other half is thinking about the problem. As I learn more about programming I feel that it is even more about solving problems.

          • amki@feddit.de
            arrow-up
            2
            ·
            1 year ago

            It’s the other way round. Code is being written to fit how a specific machine works. This is what makes Assembly so hard.

            Also, there is by design no understanding required: a machine doesn’t “get” what you are trying to do, it just does what is there.

            If you want a machine to understand what specific code does and modify that for another machine that is extremely hard because the machine would need to understand the semantics of the operation. It would need to “get” what you were doing which isn’t happening.

          • amki@feddit.de
            arrow-up
            8
            ·
            1 year ago

            About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words.

            Don’t be surprised but about half of the time I can predict the result of a coin flip.

            I’m not saying it’s not interesting but needing custom training and an fMRI is not “an AI can read minds”

            It can see if patterns it saw previously reappear in a heavily time-delayed fMRI. Looking for patterns you already know isn’t such an impressive feat. Computers have done this for ages now.

            It literally can’t read minds.

            • sfgifz@lemmy.dbzer0.com
              arrow-up
              3
              ·
              edit-2
              1 year ago

              Later, the same participants were scanned listening to a new story or imagining telling a story and the decoder was used to generate text from brain activity alone. About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words.

              You left out the most important context about “half of the time”. Guessing what you’re thinking of just by looking at your brain activity with 50% accuracy is a very, very good achievement. It’s not picking between a 1 or 0 outcome like you are with your coin flip.

              You can pretend that the AI is useless and you’re the smartest boy in the class all you want, doesn’t negate the accomplishments.

              • amki@feddit.de
                arrow-up
                1
                ·
                1 year ago

                Being close (and “sometimes” precise) to the intended meaning is an equally useless metric to measure performance.

                Depending on what you allow for “well close enough I think” asking ChatGPT to tell a story without any reading of fMRI would get you to these results. Especially if you know beforehand it’s gonna be a story told.

        • secret301@sh.itjust.works
          arrow-up
          4
          ·
          1 year ago

          I think it’ll be in our lifetime just not anytime soon. I feel like AI is gonna boom like the internet did. Didn’t happen overnight and not even in a year but over 35ish years

        • GBU_28@lemm.ee
          English
          arrow-up
          3
          ·
          1 year ago

          Off the shelf models do this, yes.

          Sophisticated local trained models on expensive private hardware are already dunking on publicly available versions. The problem of hallucination is generally resolved in those contexts

          • amki@feddit.de
            arrow-up
            5
            ·
            1 year ago

            Sure, but until I see such a thing I choose not to believe in fairy tales.

            Decompiling arbitrary architecture machine code is quite a few levels above everything I’ve seen so far which is generally pretty basic pattern recognition paired with statistics and training reinforcement.

            I’d argue decompiling arbitrary machine code into either another machine code or legible higher-level code is in a whole other league than what AI has proven to be capable of.

            Especially because with this being 90% accurate is useless.

            • GBU_28@lemm.ee
              English
              arrow-up
              3
              ·
              1 year ago

              Again you aren’t seeing this because these models are being developed for private enterprise purposes.

              Regarding deep machine code analysis, sure, that’s gonna take work but the whole hallucination thing is an off the shelf, rookie problem these days

              • Rikudou_Sage@lemmings.world
                English
                arrow-up
                1
                ·
                1 year ago

                It’s not, though. Hallucinations are inherent to the technology, it’s not a matter of training. Good training can greatly reduce the likelihood, but cannot solve it.

          • sacredfire@programming.dev
            arrow-up
            1
            ·
            1 year ago

            Why does a pre-trained model need expensive private hardware after it was trained, other than to handle API requests faster? Is OpenAI training ChatGPT on inferior hardware compared to these sophisticated private versions you mentioned?

            • GBU_28@lemm.ee
              English
              arrow-up
              3
              ·
              1 year ago

              The fine tuning, while much more efficient than starting fresh, can still be a large amount of work.

              Then consider that your target corpus of data may also be large.

              Then consider that running your reasoning tasks across that corpus also takes strong hardware to get production-ready response times.

              No, OpenAI isn’t using inferior hardware, but their model goals, token chunking strategies and overall corpus are generalist in nature.

              Then there are processing strategies teams are using to go beyond the “memory” limitations GPT-4 has, which provide massive benefits to coherency, essentially anti-hallucination and better overall reasoning.

        • SnipingNinja@slrpnk.net
          arrow-up
          3
          ·
          1 year ago

          Idk the specifics, but what you say makes it sound like it would be easier to create an AI that recreates a game based on gameplay visuals (and the relevant controls)

          • amki@feddit.de
            arrow-up
            1
            ·
            1 year ago

            That game would still not work because there is a ton of hidden state in all but the simplest computer games that you cannot tell from just playing through the game normally.

            An AI could probably reinvent Flappy Bird because there is no more depth than what is currently on screen, but that’s about it.

    • perviouslyiner@lemm.ee
      English
      arrow-up
      10
      ·
      edit-2
      1 year ago

      It was a staple of Asimov’s books that while trying to predict decisions of the robot brain, nobody in that world ever understood how they fundamentally worked.

      He said that while the first few generations were programmed by humans, everything since that was programmed by the previous generation of programs.

      This leads us to Asimov’s world in which nobody is even remotely capable of creating programs that violate the assumptions built into the first iteration of these systems - are we at that point now?

      • amki@feddit.de
        arrow-up
        9
        ·
        1 year ago

        No. Programs cannot reprogram themselves in a useful way and are very very far from it.

        • legion02@lemmy.world
          arrow-up
          2
          ·
          1 year ago

          Eh, I’d say continuous training models are pretty close to this. Adapting to changing conditions and new input is kinda what they’re for.

          • Bjornir@programming.dev
            arrow-up
            1
            ·
            1 year ago

            Very far from reprogramming though. The general shape of the NN doesn’t change, you won’t get a NN made to process images to suddenly process code just by training it.

  • Southern Wolf@pawb.social
    arrow-up
    26
    ·
    1 year ago

    It’s honestly remarkable how few people in the comments here seem to get the joke.

    Never stop dissecting things, y’all.

  • oldfart@lemm.ee
    arrow-up
    19
    ·
    1 year ago

    IDA Pro (a disassembler) is closed source but came with a license that allowed disassembly and binary modification. Unfortunately, that’s no longer the case.

  • kamen@lemmy.world
    English
    arrow-up
    18
    ·
    1 year ago

    Joke aside, that’s kind of like claiming that any web frontend is open source because you can access the built, minified and often obfuscated source of it.

    • Jocker Black@lemmy.ml
      arrow-up
      1
      ·
      1 year ago

      So true! I have been “hacking” some chrome extensions recently, do you know of a tool for reverse engineering JS?

  • over_clox@lemmy.world
    arrow-up
    13
    ·
    1 year ago

    If you wanna skip a few inconvenient instructions in x86 assembly, throw a few No Operation instructions in the right places.

    NOP = 0x90
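As a rough sketch of what that patch looks like in practice: the helper below overwrites a byte range with `0x90`. The function name `nop_out` and the six-byte sample are invented for illustration; the sample encodes `cmp eax, 0` / `jne +5` / `ret`, and the patch removes the conditional jump.

```python
NOP = 0x90  # single-byte x86 no-operation opcode


def nop_out(code, offset, length):
    """Return a copy of `code` with `length` bytes at `offset` replaced by NOPs."""
    if offset < 0 or length < 0 or offset + length > len(code):
        raise ValueError("patch range is outside the code")
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


# Hypothetical snippet: 83 F8 00 = cmp eax, 0 ; 75 05 = jne +5 ; C3 = ret
sample = bytes.fromhex("83f8007505c3")

# Overwrite the two-byte jne so execution always falls through.
patched = nop_out(sample, offset=3, length=2)
```

The patch has to cover whole instructions and keep the total length unchanged, otherwise every jump target after it would shift.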

    • SzethFriendOfNimi@lemmy.world
      English
      arrow-up
      4
      ·
      edit-2
      1 year ago

      And so you add a hashing check. But then that can be removed.

      So you need one in the OS but that can be removed.

      So you need one in hardware.

      In other words no matter how clever you are there’s always a way to monkey with something unless you have absolute control from silicon on up.
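The first step of that escalation, a hashing check, might look something like the sketch below. It assumes SHA-256 from Python's standard `hashlib`; the byte strings and the name `is_untampered` are illustrative only.

```python
import hashlib


def digest(code):
    """Hex SHA-256 digest of a block of code bytes."""
    return hashlib.sha256(code).hexdigest()


def is_untampered(code, expected_digest):
    """True if the code still hashes to the digest recorded at build time."""
    return digest(code) == expected_digest


# Hypothetical original code and a copy with two bytes replaced by NOPs (0x90).
original = bytes.fromhex("83f8007505c3")
patched = bytes.fromhex("83f8009090c3")

expected = digest(original)  # would be stored alongside the program
ok_original = is_untampered(original, expected)
ok_patched = is_untampered(patched, expected)
```

Here `ok_original` is true and `ok_patched` is false. The catch is that `is_untampered` is itself just code in the same binary, so whoever patched the jump can patch the check too.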

      Here’s a really interesting video the Xbox team did on the challenges of trying to make sure that the content running wasn’t pirated.

      https://youtu.be/U7VwtOrwceo

      While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.

      • grue@lemmy.ml
        arrow-up
        5
        ·
        1 year ago

        While DRM is the bane of everybody there are cases where trust and integrity is important and it’s an intriguing look into how hard it is to manage.

        Nah, when the user wants to ensure trust and integrity in his own system, it works just fine. The problem comes when the user who needs to be able to access the data is simultaneously the adversary who needs to be stopped from accessing the data.

        In other words, it’s one of those situations where the fact that it’s hard to manage is a gigantic clue that it’s wrongheaded to try to do so in the first place.

        • SzethFriendOfNimi@lemmy.world
          English
          arrow-up
          2
          ·
          1 year ago

          I agree. I mean when doing secure channel communications or weapons systems or health biometrics.

          There are cases where you need to be sure of the integrity of the data and environment

  • 🇰 🌀 🇱 🇦 🇳 🇦 🇰 ℹ️@yiffit.net
    English
    arrow-up
    9
    ·
    edit-2
    1 year ago

    I’ve wondered: Can you go deeper than assembly and code in straight binary, or does it even really matter because you’d be writing the assembly in binary anyway or what? In probably a less stupid way of putting it: Can you go deeper than assembly in terms of talking to the hardware and possibly just flip the transistors manually?

    Even simpler: How do you one up someone who codes in assembly? Can you?

    • ylph@lemmy.world
      arrow-up
      16
      ·
      edit-2
      1 year ago

      The first computer I used was a PDP-8 clone, which was a very primitive machine by today’s standards - it only had 4k words of RAM (hand-made magnetic core memory !) - you could actually do simple programming tasks (such as short sequences of code to load software from paper tape) by entering machine code directly into memory by flipping mechanical switches on the front panel of the machine for individual bits (for data and memory addresses)

      You could also write assembly code on paper, and then convert it into machine code by hand, and manually punch the resulting code sequence onto paper tape to then load into the machine (we had a manual paper punching device for this purpose)

      Even with only 4k words of RAM, there were actually multiple assemblers and even compilers and interpreters available for the PDP-8 (FOCAL, FORTRAN, PASCAL, BASIC) - we only had a teletype interface (that printed output on paper), no monitor/terminal, so editing code on the machine itself was challenging, although there was a line editor which you could use, generally to enter programs you wrote on paper beforehand.

      Writing assembly code is not actually the same as writing straight machine code - assemblers actually do provide a very useful layer of abstraction, such as function calls, symbolic addressing, variables, etc. - instead of having to always specify memory locations, you could use names to refer to jump points/loops, variables, functions, etc. - the assembler would then convert those into specific addresses as needed, so a small change of code or data structures wouldn’t require huge manual process of recalculating all the memory locations as a result, it’s all done automatically by the assembler.
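The label bookkeeping described above can be sketched as a two-pass assembler. Everything here is an assumption made for illustration: the five-instruction set, the fixed two-byte encoding (opcode, operand) and the source syntax are invented and are not PDP-8 code.

```python
# Hypothetical instruction set: mnemonic -> opcode byte.
OPCODES = {"HALT": 0, "LOAD": 1, "ADD": 2, "STORE": 3, "JUMP": 4}
WORD = 2  # every instruction is encoded as 2 bytes: opcode, operand


def assemble(source):
    """Assemble toy source text into bytes, resolving label names to addresses."""
    # Pass 1: strip comments, collect instructions, record each label's address.
    labels = {}
    instructions = []
    for raw in source.splitlines():
        line = raw.split(";")[0].strip()
        if not line:
            continue
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions) * WORD
        else:
            instructions.append(line.split())

    # Pass 2: emit bytes, replacing label names with the recorded addresses.
    out = bytearray()
    for parts in instructions:
        mnemonic = parts[0]
        operand = parts[1] if len(parts) > 1 else "0"
        value = labels[operand] if operand in labels else int(operand, 0)
        out += bytes([OPCODES[mnemonic], value])
    return bytes(out)


SOURCE = """
start:
    LOAD 10
loop:
    ADD 1       ; keep adding one
    JUMP loop   ; the assembler fills in the address of 'loop'
    HALT
"""

first = assemble(SOURCE)
# Insert an instruction before the loop: the label moves from address 2 to 4,
# and the JUMP operand is recalculated without touching the JUMP line.
second = assemble(SOURCE.replace("loop:", "    STORE 20\nloop:"))
```

In `first` the jump is encoded as `04 02`; in `second` it becomes `04 04`. When assembling by hand, that recalculation is the part you do with pencil and paper.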

      So yeah, writing assembly code is still a lot easier than writing direct machine code - even when assembling by hand, you would generally start with assembly code, and just do the extra work that an assembler would do, but by hand.

    • CaptainBuckleroy@lemm.ee
      arrow-up
      14
      ·
      1 year ago

      Yes, you can code in machine code. I did it as part of my CS Degree. In our textbook was the manual for the particular ARM processor we coded for, that had every processor-specific command. We did that for a few of the early projects in the course, then moved onto Assembly, then C.

    • doggle@lemmy.dbzer0.com
      arrow-up
      11
      ·
      1 year ago

      Assembly effectively is coding in binary. Been a long time since I’ve looked at it, but you’d basically just be recreating the basic assembly commands anyway.

      I guess you could try flipping individual transistors with a magnet or an electron gun or something if you really want to make things difficult.

      If you actually want to one-up assembly coders, then you can try designing your own processor on breadboard and writing your own machine code. Not a lot of easy ways to get into that, but there’s a couple of turbo dorks on YouTube. Or you could just try reading the RISC-V specification.

      But even then, you’re following in someone else’s tracks. I’ve never seen someone try silicon micro-lithography in the home lab, so there’s an idea. Or you could always try to beat the big corps to the punch on quantum computing.

    • gerryflap@feddit.nl
      arrow-up
      8
      ·
      1 year ago

      You can code in binary, but the only thing you’d be doing is frustrating yourself. We did it in the first week of computer science at the university. Assembly is basically just a human readable form of those instructions. Instead of some opcode in binary you can at least write “add”, which makes it easier to see what’s going on. The binary machine code is not some totally other language than what is written in the assembly code, so writing in binary doesn’t really provide any more control or benefit as far as I’m aware.

    • jadero@programming.dev
      arrow-up
      6
      ·
      1 year ago

      All those assembly language instructions are just mnemonics for the actual opcodes. IIRC, on the 6502 processor family, JSR (Jump to SubRoutine) was hex 20, decimal 32. So going deeper would be really limited to not having access to the various amenities provided by assembler software and writing the memory directly. For example:

      I started programming using a VIC-20. It came with BASIC, but you could have larger programs if you used assembly. I couldn’t afford the assembler cartridge, so I POKED the decimal values of everything directly to memory. I ended up memorizing some of the more common opcodes. (I don’t know why I was working in decimal instead of hex. Maybe the text representation was, on average, smaller because there was no need of a hex symbol. Whatever, it doesn’t matter…)

      VIC-BASIC had direct memory access via PEEK (retrieve value) and POKE (set value). It also had READ and DATA statements. READ retrieved values from the comma-delimited list of values following the DATA statement (usually just a big blob of values as the last line of your program).

      I would write my program as a long comma-delimited list of decimal values in a DATA statement, READ and POKE those values in a loop, then execute the resulting program. For small programs, I just saved everything as that BASIC program. For larger programs, I wrote those decimal values to tape, then read them into memory. That let me do a kind of modular programming by loading common functions from tape instead of retyping them.
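For anyone who never used PEEK and POKE, the READ/POKE loader loop can be approximated in Python. This is only a simulation: `memory` is a plain `bytearray` standing in for the machine's address space, the load address 828 is an illustrative choice, and the six decimal values are a made-up 6502 routine (LDA #1, STA $0200, RTS).

```python
# Simulated 64 KiB address space standing in for the machine's memory.
memory = bytearray(65536)


def poke(address, value):
    """Store one byte at an address, like BASIC's POKE."""
    if not 0 <= value <= 255:
        raise ValueError("POKE values must fit in one byte")
    memory[address] = value


def peek(address):
    """Read one byte from an address, like BASIC's PEEK."""
    return memory[address]


# The "DATA statement": machine code written out as decimal byte values.
# 169, 1 = LDA #1 ; 141, 0, 2 = STA $0200 ; 96 = RTS
DATA = [169, 1, 141, 0, 2, 96]
START = 828  # illustrative load address (an assumption)


def load(start, data):
    """READ each value and POKE it into consecutive addresses."""
    for offset, value in enumerate(data):
        poke(start + offset, value)
    return start + len(data)  # first free address after the program


end = load(START, DATA)
```

On the real machine the BASIC program would then jump to the start address to run the routine (`SYS` in Commodore BASIC); the simulation stops at loading.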

      I was in the process of writing my own assembler so that I could use the mnemonics directly when I got my Apple //c. More memory and the availability of quite a few high level languages derailed me and I haven’t touched assembly since.

    • mrpants@midwest.social
      English
      arrow-up
      3
      ·
      1 year ago

      Re: Coding in binary. It makes no difference. Your assembly is binary, just represented in a more human readable form when writing it in assembly.

      Re: Manual interaction. Sure there’s plenty of old computers where you can flip switches to input instructions or manipulate registers (memory on the cpu). But this is not much different from using assembly instructions except you’re doing it live.

      You can also create purpose built processors which might be what you mean? Generally this isn’t too useful but sometimes it is. FPGAs are an example of doing this type of thing but using software to do the programming of the processor.

    • Lifted_lowered@lemmy.world
      English
      arrow-up
      1
      ·
      1 year ago

      You could make a simple accumulator machine out of logic gates and enter binary instructions expressed in hexadecimal into its register to program it, yeah, but it’s not capable of all the operations of a computer. And yes, the first programming was just opcodes, entered with flipped switches or punch cards; there was no assembly language. But assembly language is pretty much just mnemonics for operations and registers. I had to write a couple of C programs in school and use the GNU C compiler to disassemble them into x86 assembly and see what it was doing on that level. Then we “wrote” some x86 assembly by copy-pasting a lot of instructions, but it’s not that hard to make something that works in x86 assembly or Jasmin (Java virtual machine assembly language) if it’s simple enough.