Category: Daily Report

  • Day 93 – Nocode Automation Tools

    Yesterday I took a good step forward on a stealth AI project I’ve been involved with since January. It’s operating in a crowded market, but we’ve recently identified a better way to pivot. I’ll be able to announce it soon, once the final alpha tests are done.

    Also this week we have potentially found a new addition to our management team – a proven AI lead generation specialist. But more on that another time.

    Today I took another look at N8N. I’d already tried it once before and was impressed. I was quickly able to create a workflow that read a webpage, summarised it, and then pumped the summary to Slack as a message.
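
    Under the hood, an N8N workflow is just JSON – a list of nodes plus the connections between them. As a rough sketch of that workflow’s shape (the node type names and parameters here are from memory and purely illustrative – don’t expect this to import cleanly into your N8N version):

    {
      "nodes": [
        { "name": "Fetch Page", "type": "n8n-nodes-base.httpRequest",
          "parameters": { "url": "https://example.com/some-article" } },
        { "name": "Summarise", "type": "n8n-nodes-base.openAi",
          "parameters": { "prompt": "Summarise this page: {{ $json.data }}" } },
        { "name": "Post To Slack", "type": "n8n-nodes-base.slack",
          "parameters": { "channel": "#updates", "text": "={{ $json.summary }}" } }
      ],
      "connections": {
        "Fetch Page": { "main": [[ { "node": "Summarise", "type": "main", "index": 0 } ]] },
        "Summarise": { "main": [[ { "node": "Post To Slack", "type": "main", "index": 0 } ]] }
      }
    }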

    Honestly, these automation tools WILL be part of every single company going forward. They will replace a lot of internal software platforms.

    My first aim is to go from a WordPress post to publishing on:

    • X
    • LinkedIn
    • Facebook
    • Instagram

    There is already a template set up for this, which is what’s so powerful about this N8N stuff – most of what you want to automate has already been done by someone else.

    • Installing N8N with Docker is super straightforward
    • Look at the automation templates: https://n8n.io/workflows/
    • For local use it’s super easy to create a webhook tunnel

    https://n8n.io/workflows/3086-publish-wordpress-posts-to-social-media-x-facebook-linkedin-instagram-with-ai

    Today’s video analysis is of the guy who made Windsurf – which I haven’t used yet, but am going to give a proper try. Might prefer it to Cursor!

    The main takeaway is:

    • The sales team combines AEs (account executives) with forward-deployed technical people

    In other news, here’s a real-time object identification demo that runs locally, which I want to try out:

    https://github.com/ngxson/smolvlm-realtime-webcam

  • Day 92 – OpenAI API Structured Outputs

    For me, Structured Outputs are essential when talking to an LLM. In most cases, an application needs to use data in a certain way, so you want OpenAI to send back data in a specific structure rather than having to format it yourself.

    Structured Outputs (SOs) ensure an OpenAI response is structured in the way you would like.

    Prior to SOs, you would use a carefully worded prompt to try to get the response in the right structure – and hope the model complied.
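
    That old approach looked something like this (a minimal sketch, purely illustrative – the “schema” lives in the prompt with nothing enforcing it, and you hope JSON.parse succeeds):

    import OpenAI from 'openai';

    const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

    const completion = await client.chat.completions.create({
      model: "gpt-4o",
      messages: [
        // Beg for JSON in the prompt – there is no guarantee behind it
        { role: "system", content: 'Reply ONLY with valid JSON of the shape { "cottageName": string, "numberOfGuests": number }. No other text.' },
        { role: "user", content: "Point Break sleeps 8 guests across 4 bedrooms..." },
      ],
    });

    let data;
    try {
      data = JSON.parse(completion.choices[0].message.content);
    } catch (e) {
      // The model can still wrap the JSON in prose or markdown fences,
      // leaving you to retry or regex-strip the response
      console.error('Model returned non-JSON output', e);
    }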

    Parsing a Rental Property Webpage

    Let’s define a schema that will take a blurb from a cottage rental website and output a structured object …

    import { NextResponse } from 'next/server';
    import OpenAI from 'openai';
    import { zodTextFormat } from "openai/helpers/zod";
    import { z } from "zod";

    export async function POST(request) {
      try {
        const client = new OpenAI({
          apiKey: process.env.OPENAI_API_KEY,
          organization: process.env.OPENAI_ORG_ID,
        });

        const requestData = await request.json();
        const { model, input } = requestData;

        // The Zod schema describing exactly the object we want back
        const PropertyListing = z.object({
          cottageName: z.string(),
          numberOfGuests: z.number(),
          numberOfBedrooms: z.number(),
          numberOfBathrooms: z.number(),
          allowsPets: z.boolean(),
          keyFeatures: z.array(z.string()),
          marketingCopy: z.string()
        });

        // responses.parse sends the schema along with the request and
        // validates the model's output against it
        const response = await client.responses.parse({
          model: model || "gpt-4o-2024-08-06",
          input: input,
          text: {
            format: zodTextFormat(PropertyListing, "property"),
          },
        });

        // output_parsed is already an object matching PropertyListing
        return NextResponse.json({
          success: true,
          event: response.output_parsed
        });
      } catch (error) {
        console.error('OpenAI API Error:', error);
        return NextResponse.json(
          { success: false, error: error.message },
          { status: 500 }
        );
      }
    }

    I can call this API crudely with something like:

    // Inside a React component – setEvent / setError / setLoading are
    // assumed to be useState setters
    const parseCottage = async () => {
      setLoading(true);
      try {
        const response = await fetch('/api/openai/parse-cottage', {
          method: 'POST',
          headers: {
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            model: "gpt-4o-2024-08-06",
            input: [
              { role: "system", content: "Extract the cottage information." },
              {
                role: "user",
                content: `webpage html here`
              },
            ],
          }),
        });

        const data = await response.json();

        if (!data.success) {
          throw new Error(data.error || 'Failed to parse cottage');
        }

        setEvent(data.event);
      } catch (err) {
        console.error('Error parsing event:', err);
        setError(err.message || 'Failed to parse cottage');
      } finally {
        setLoading(false);
      }
    };

    The response I get is

    {
      "cottageName": "Point Break",
      "numberOfGuests": 8,
      "numberOfBedrooms": 4,
      "numberOfBathrooms": 3,
      "allowsPets": true,
      "keyFeatures": [
        "Beachside location",
        "Contemporary interiors",
        "Part of The Dunes development",
        "Traffic-free location",
        "Open plan living",
        "Floor to ceiling windows",
        "Balcony with ocean views"
      ],
      "marketingCopy": "A beautifully finished beachside house over three floors with contemporary interiors to the highest standard. Part of The Dunes development located directly in front of the beach. Tucked away in an almost traffic-free location. Open plan living, dining and kitchen space on the 2nd floor, taking advantage of the breath-taking views over the golden sand dunes, beach and Atlantic Ocean. The spacious living area comprises of two large sofas, TV, DVD player, coffee table, floor to ceiling windows, patio doors opening onto the balcony."
    }

    So, what that’s done is convert some HTML into a JSON object that summarises the HTML. Useful!

  • Days 85 – 91 – Dropped into a part-time shitstorm and extracting myself from it.

    100 days ago I decided to focus on building an AI startup. I gave myself a time limit (five months left) to demonstrate to myself that I could get something up and running.

    I was getting somewhat anxious about having no income coming in, so six weeks ago I took on a part-time freelance project for a company that was switching their booking system.

    I had planned for a nice ongoing earner of about 1.5 days’ work a week, giving me plenty of time to focus on AI stuff.

    What actually happened was that I got dropped into a shitstorm.

    That said, I got them launched, and have backed out now. I won’t go into details, to spare their blushes, but I did think I would write some general observations about IT project management.

    Why IT projects go bad, and what to do when they do.

    Software projects can essentially have infinite scope. There is always something that can be tweaked or added. Add business management to the mix and they will demand a long list of features …

    1 – You didn’t clearly agree scope with all stakeholders at the beginning, and get them to sign off on it.

    All parties have a responsibility to be as clear as possible about what the project is going to do. With money changing hands it’s absolutely essential.

    If you are just prototyping, it’s slightly different.

    You would rarely set builders to work on a home project without some form of blueprint.

    In my early career, I ignored scope and most of the time I got away with it. But as soon as projects became more complex, it became essential.

    The work spec is essential because:

    • You can protect yourself when the customer asks for more than they are paying for. As soon as the customer says, ‘well I thought I was getting X, I really need X’ … you can direct them back to the initial agreement. ALWAYS build what was initially agreed, release that, and then build the new feature afterwards.
    • If done correctly, the work spec will force the development team to think about HOW they are going to implement it, broadly speaking, and so the system architecture begins to form.
    • New management or developers don’t know where they are without the work specification.
    • It increases the probability that the cost estimate is accurate
    • It forces the customer to think about what they actually want

    Don’t mistake flowcharts for a specification. Documenting scope is a skill. Flowcharts are an essential part of the spec, but they still don’t describe what is being built.

    2 – The initial scope was wildly excessive, or the customer adds ongoing scope throughout the project (and you let them)

    Since software scope can be infinite, it’s easy for a customer to say they want everything. Writing a work spec helps guard against this, but if you don’t really think through the detail involved, you’ll get the cost wrong.

    In most cases, a good developer should identify the handful of functions that actually represent the core USP of the system.

    If you are building a new product from scratch, always identify the very core of what you are trying to do and focus on that. Get those shipped and out there, and begin testing from there.

    If you are halfway through a project, the BIGGEST problem I’ve always found is that management will add to the scope rather than go to launch. They will insist feature A can’t be finished without some extra ‘essential’ feature. This becomes a habit, and the product NEVER gets launched.

    I worked on a project for two years and, toward the end, management kept asking for more features … AND I LET THEM (at a price) … I was naive … and eventually upper-upper management came into the project, the manager demanding the new features said ‘they haven’t finished’ … upper management got cold feet and pulled the project. We were so close to releasing a product that was perfectly timed to do well in the real world. Yet it failed to launch because scope kept getting added.

    3 – The developer never documented the system

    This is a big one. It doesn’t seem it at the start, but eventually you need to bring other developers in … do you really want to explain repeatedly how the system works?

    Documentation should be as concise as possible. You will get egos in the team who think they demonstrate their intelligence by producing a 100-page spec that no one is ever going to have the time to read.

    Documentation isn’t the initial project scope. A lot of programmers say that the code should explain itself. That’s correct, but it also misses the point. Large codebases should be described (AI can help a huge amount now) … there are always things that fellow developers need to know.

    4 – You are building on a system that is shit

    This is something that isn’t easily fixed. When you are working on a system that hasn’t been thought through properly and has just been cobbled together over the years, it really does become a major problem. Developers don’t want to work on it, they get stressed, and they mimic the state of the codebase by taking further shortcuts. The problem compounds itself.

    5 – You are under-resourced

    There’s no way around it. If you don’t have the money to build your system then you will suffer. And if you are a technical founder, if you don’t have the time … then you are not going to get there either.

    You avoid this by:

    • Being very clear about what you want and don’t want
    • Keeping your scope realistic
    • Making sure your work brief has identified how much it will REALISTICALLY cost, by working out how to do it

    If you are deep into a project and the money has run out … it’s generally because you’ve messed up somewhere along the line. The main thing you can do is give equity to a decent developer and ask them to work on it at minimum wage for a while, to bring it up to the correct level.

    Or you just take out another credit card – but then you get very clear about what you are and aren’t doing.

    6 – Poor communication (too little or too much)

    The main problem with this last freelance project was that the two developers weren’t successfully communicating via email. They were working, effectively, in different time zones … when I came in, I was able to bridge that gap, and in just a few calls I had resolved blockages that had been around for quite a while.

    Emails are not the best form of communication. You really need to have developers talking to one another and sharing screens so they can properly explain their problems.

    So there, under-communication was the problem.

    Then you can get over-communication, where you get dumped with a ton of stuff that doesn’t matter … often information that should really be in a Google Doc that can be referenced. If you add to the mental overhead of a developer, their productivity will drop. So often you just need to keep the technician focused and protected from the business decisions going on around them.

  • Day 71 – Sora !

    For today I thought I would just add my first Sora video.

  • Day 63 – Are you seeing it yet?

    The generative industry is moving so fast, with images, video, music, speech and the written word all now improving at a rapid pace.

    Many people are not seeing it.

    Today I was able to build a web tool that may not be great for production, but is really quite good for debugging and investigating an API. The ‘AI’ was able to understand some of my complex but not especially well-explained prompts in a way that blew me away, and it produced some really valuable work.

    I don’t fully know whether programming is going to die as a profession. I think *eventually* yes, it probably will be abstracted away much like assembly language has been. All the patterns will be fully understood, trained and tweaked, and models will be able to produce huge swathes of code that actually create a great product.

    When you do get into complex systems though – whilst I’m no Silicon Valley person – it does seem that there are limitations, and that it needs an experienced human (developer) guiding it through the last 5% to get things properly working.

    That’s just programming. The rest of the generative industry is equally as mind blowing.

    Are people really seeing what’s going on?

    And do they understand the magnitude?

    And are they embracing it?

    Work Diary

    Laravel Herd & Filament

    Whilst I have been prototyping with vanilla JavaScript, I checked out Filament last night on Laracasts:

    Filament Forms

    The Form library alone is complete enough to warrant using Filament. It has so many small quality-of-life features that add up. From memory, here are a few that convinced me:

    • Really nice prefix and suffix options
    • Really nice icon and description and helper text integrations
    • Excellent search integration into long selects in one line of code
    • A really easy wizard workflow setup
    • Excellent integration of new entity creation when a user needs to be able to add a new option to a select list
    • Easy linking between SELECTS

    This is of course on top of it working solidly as you would expect.

    So, I’m flipping my prototype to use Filament … which is an excellent basis going forward.

    Filament File Uploads

    In one line, it’s got a drag-and-drop file upload with image preview built in. It uses FilePond JS.

  • Day 62 – Laravel + Frontend Tooling + Random Thoughts

    Today, I was looking at making a final decision on whether to use a React or Vue frontend. My prototype has been coming along, but I had just been using plain PHP Blade templates with vanilla JavaScript for quickness.

    I think web development has got totally overcomplicated, and that you can go a long way with vanilla stuff. I would never start a PHP project without Laravel however, since I know it has virtually everything I need to build whatever I want.

    Since LLMs have come about, there’s a lot to be said for just writing functionality in JavaScript; but ultimately you do want to benefit from a frontend framework. State management, DOM manipulation and an event bus are the reasons why I want to use one. And of course, components. It’s a shame the official Web Components standard is so poor.

    React Or Vue? Or Livewire?

    The age-old debate for a developer. Quite frankly, it’s annoying at this point to continually have to decide what to use. I prefer the simplicity of Vue syntax, but I often use graduates from a bootcamp that teaches React, and I have bought quite a few premium templates that use React too. To be honest though, with LLMs it’s so easy to switch between them.

    Then there’s Livewire. Each time I’ve tried to get into Livewire I’ve been put off by it. But I had the same resistance when I got into Flutter, and once I persevered I loved it.

    To help me decide, I looked into Filament.

    Filament

    Filament is the TALL stack (Tailwind, Alpine, Laravel, Livewire) … and I’ve used previous versions and was fairly impressed. This would be a few steps back in order to take a leap forward, but having looked at the excellent Laracasts series on it, I’m starting to see the benefits.

    • Resource creation gives you the CRUD UI upfront … this is slightly different from the CRUD UI generation I have been working on, since mine is more ‘on-the-fly’ than being defined in YAML files. Still, resource creation results in a full-on CRUD interface with sortable datatables out of the box.
    • The form library is pretty much flawless. No more messing about with forms.
    • Select options can be configured with enums, and it all gets enforced really nicely.
    • Multi-tenant stuff out of the box
    • Can hook up with Laravel Stripe for subscriptions

    Laravel Herd

    • I’ve been a fan of Laravel Sail for a while, but that’s only because it abstracted complication away from me. Whilst I get tired of learning new things, that’s no reason not to embrace something that works even better. Enter Laravel Herd. Some benefits:
      • I don’t have to use Docker Desktop, which likes to hog a huge amount of system memory. You can change this in its settings, but overall Herd is a lot faster.
      • For some reason, my local Composer would not install the Laravel CLI tool on my machine, so I had to resort to installing Laravel via shell scripts. Herd got it installed immediately.
      • Herd includes TinkerWell, some debugging tools, and general system configuration

    Consultancy Fees

    I’m getting tired of being asked to ‘look into stuff’ with the expectation that it will be done for free. I need to get a consultancy page hooked up to a simple credit card payment.

  • Day 61 – Prototyping

    I’ve been using my very basic prototype to store data about ‘things’ in my life – I use LLMs to set up this data, and then I have a UI that allows me to interact with it.

    One of the things I’ve been doing with it is keeping track of projects, ideas and tasks … and the problem I started to face was that I didn’t like the UI I had used as the index for these data items.

    Last week I’d spotted a nice slider effect that I liked, and today I decided to put time into my own project for once, and just implement that.

    So now I still have a CRUD interface for my data, but I’ve also integrated the beginnings of a nice swiper interface so that I can swipe through each ‘thing’ one at a time, consider it, and then move on to the next.

    I made an animated gif to briefly showcase it. Since I’m running out of time today, I’ll leave it here … but this is currently cycling through my blueprints … it’s a good start, and it’s going to be a much more enjoyable way of moving through the things going on.

    And of course eventually the images will be generated according to the content.

    More updates tomorrow.

  • Days 58 – 60 – Quick Catchup & MCP – Model Context Protocol Revolution

    It’s been a few days since I posted last. I’m going to let myself off, but resolve to do better in the future – I do want to keep posting on a daily basis, but I’m also aware I want to make the posts more valuable, so that they represent value that can then go onto LinkedIn and the other social networks.

    Things are definitely building in my mind, and projects are coming along. However, I’m still a long way from anything concrete taking off.

    For the moment I would just like to talk about MCP, because it really is a very significant milestone in the AI world. (Image context from OpenAI was another one recently.)

    MCP

    Model Context Protocol was announced in Nov 2024 by Anthropic.

    This is a big step forward since it has standardised how language models can/should interact with external tools.

    For LLMs to be more useful, they need to be able to action things.

    I’m guessing that once Anthropic had integrated a couple of tools with their LLMs they realised it would be better to have a standard way of doing that … and from that they came up with the MCP architecture.

    On first understanding, giving a language model a standard way of interfacing with the outside world is a bit of a game-changer, and there are already a ton of integrations that we can use straight away. Things are happening really fast.

    Core Architecture

    Essentially, you have an application (called the Host), and this has a Client inside it.

    The Client maintains connections to the Servers, which have access to the tools you want the LLM to use.
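
    In code, wiring a Host’s Client up to a Server looks something like this with the official TypeScript SDK, usable from plain JavaScript (a minimal sketch – the client name and the server.js command are my own placeholders):

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // The Host application owns this Client
    const client = new Client({ name: "my-host-app", version: "1.0.0" });

    // Connect to a Server process over stdio – the Server wraps the actual tools
    const transport = new StdioClientTransport({ command: "node", args: ["server.js"] });
    await client.connect(transport);

    // Ask the Server which tools it exposes, so they can be offered to the LLM
    const { tools } = await client.listTools();
    console.log(tools.map(t => t.name));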

    Protocol Layer

    It’s fairly simple … you have:

    • Requests
    • Notifications
    • Results

    You have functions that (see the sketch after this list):

    • Handle incoming requests
    • Send requests and await responses
    • Handle incoming notifications
    • Send notifications
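
    Here’s a generic sketch of those four roles in JavaScript. To be clear, this is illustrative plumbing rather than the MCP SDK – send() stands in for whatever transport write function you have, and handleRequest()/handleNotification() are assumed application callbacks:

    let nextId = 1;
    const pending = new Map(); // id -> { resolve, reject } for in-flight requests

    // Send a request and await the response
    function request(method, params) {
      const id = nextId++;
      return new Promise((resolve, reject) => {
        pending.set(id, { resolve, reject });
        send({ jsonrpc: "2.0", id, method, params });
      });
    }

    // Send a notification – no id, so no response is expected
    function notify(method, params) {
      send({ jsonrpc: "2.0", method, params });
    }

    // Called for every incoming message from the transport
    function onMessage(msg) {
      if (msg.id !== undefined && ("result" in msg || "error" in msg)) {
        // A result (or error) arriving for one of our earlier requests
        const p = pending.get(msg.id);
        pending.delete(msg.id);
        msg.error ? p.reject(msg.error) : p.resolve(msg.result);
      } else if (msg.id !== undefined) {
        // An incoming request – handle it and reply with the same id
        Promise.resolve(handleRequest(msg.method, msg.params))
          .then(result => send({ jsonrpc: "2.0", id: msg.id, result }))
          .catch(err => send({ jsonrpc: "2.0", id: msg.id,
                               error: { code: -32603, message: String(err) } }));
      } else {
        // An incoming notification – handle it, and never reply
        handleNotification(msg.method, msg.params);
      }
    }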

    Transport Layer

    All transport layers use JSON-RPC 2.0 to exchange messages.

    JSON-RPC Request Object

    • jsonrpc : always going to be “2.0”
    • method
    • params (optional)
    • id

    Without an id, a message is considered a notification, which doesn’t expect a response. In fact, servers MUST NOT reply to a notification.
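
    For example (tools/list and notifications/initialized are real method names from the MCP spec; the id value is arbitrary):

    // A request – it has an id, so the server must respond
    { "jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {} }

    // A notification – no id, so the server must not reply
    { "jsonrpc": "2.0", "method": "notifications/initialized" }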

    JSON-RPC Response Object

    • jsonrpc (version string)
    • result
    • error
    • id
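
    A response carries either result or error (never both), with the request’s id echoed back. For the request above, the two cases might look like this (-32601 is the spec’s “Method not found” code):

    // Success
    { "jsonrpc": "2.0", "id": 1, "result": { "tools": [] } }

    // Failure
    { "jsonrpc": "2.0", "id": 1, "error": { "code": -32601, "message": "Method not found" } }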

    To be clear, MCP is a standard way – a proposed specification that multiple parties can agree on – for how language models will interact with outside tools.

    Protocols are vital in tech. We have TCP/IP and email, which you use all the time. Without that agreement, we could have had an incredibly fractured internet.

    When companies and developers can agree to do things in a certain way, it makes it easier to make systems.

    Using MCP is, I assume, very much like working with interfaces. If you code an LLM up to work with your own tools and then for whatever reason decide to switch LLMs, using a protocol means there’s no lost work – because the new one will use exactly the same interface as the current one.

    Reference

    Request object (from the JSON-RPC 2.0 spec)

    An RPC call is represented by sending a Request object to a Server. The Request object has the following members:

    • jsonrpc – A String specifying the version of the JSON-RPC protocol. MUST be exactly “2.0”.
    • method – A String containing the name of the method to be invoked. Method names that begin with the word rpc followed by a period character (U+002E or ASCII 46) are reserved for rpc-internal methods and extensions and MUST NOT be used for anything else.
    • params – A Structured value that holds the parameter values to be used during the invocation of the method. This member MAY be omitted.
    • id – An identifier established by the Client that MUST contain a String, Number, or NULL value if included. If it is not included it is assumed to be a notification. The value SHOULD normally not be Null [1] and Numbers SHOULD NOT contain fractional parts [2].

    The Server MUST reply with the same value in the Response object if included. This member is used to correlate the context between the two objects.

    [1] The use of Null as a value for the id member in a Request object is discouraged, because this specification uses a value of Null for Responses with an unknown id. Also, because JSON-RPC 1.0 uses an id value of Null for Notifications this could cause confusion in handling.

    [2] Fractional parts may be problematic, since many decimal fractions cannot be represented exactly as binary fractions.

  • Day 57 – The Habits Ahead

    Strong results come from doing the right thing many times over.

    Consistency. Discipline.

    I’ve got multiple projects going on at the moment, and they can easily fill up my days entirely, leaving no time for the things that will actually continue to build the business. And that is what’s been happening – I’ve been dropping the very things that would build the business.

    These are good additions to my existing habits:

    • Grant and funding research & application
    • Event, conference and networking research
    • Email marketing pipeline
    • Website marketing
    • Social media marketing

    It’s a bit like programming yourself for success: you decide what actions the business would benefit from, and what areas need to be attended to consistently.

    These become the KPIs of your business.

  • Day 56 – Pandora’s Box.

    I think the way the world is right now, people need to feel like there is a new wave of ‘something good’ … and whilst AI is certainly going to take many jobs, it opens up a whole world of possibility for those who are prepared to learn about it, work hard and creatively use it. So there has been this huge positive wave of energy (certainly amped up with money) toward innovation. It may well be that LLMs have limitations, and that we are witnessing a bubble… but people are now TRYING new things. Technical and non-technical people are finding they can do a whole lot more using AI. What I mean is, a lot of crazy ideas are being unlocked. Pandora’s box has been opened.


    True Personal Assistants

    For me, it’s the opportunity to build a personal assistant of the magnitude I’ve wanted for a few decades. I think science fiction movies and video games influenced me in wanting to build these personal assistants.

    Examples are:

    JARVIS Assistant From Iron Man Movies

    J.A.R.V.I.S., which stands for “Just A Rather Very Intelligent System,” is Tony Stark’s natural-language user interface computer system, named after Edwin Jarvis, the butler who worked for Howard Stark and the Stark household. Initially, J.A.R.V.I.S. was developed as a simple AI assistant to control Stark’s Iron Man suit, but it evolved into a powerful AI capable of managing various tasks and assisting Stark in his personal and superhero life. J.A.R.V.I.S. uses advanced natural language processing and communication skills to understand and respond to Stark’s commands, making it more than just an AI tool – it’s a trusted companion.

    So, the ‘trusted companion’ thing here is the key.

    Cortana from Halo

    As an artificial construct, Cortana has no physical form or being. Cortana speaks with a smooth female voice, and projects a holographic image of herself as a woman. Cortana is said to resemble Halsey, with a similar attitude “unchecked by military and social protocol”. In Halo: The Fall of Reach, Cortana is described as slender, with close-cropped hair and a skin hue that varies from navy blue to lavender, depending on her mood. Numbers and symbols flash across her form when she is thinking. Halsey sees Cortana as a teenage version of herself: smarter than her parents, always “talking, learning, and eager to share her knowledge”. Cortana is described as having a sardonic sense of humor and often cracks jokes or wryly comments, even during combat.

    There are a few more, but that’s enough for now.

    So, the key components are:

    • LLMs (Language Models)
    • Decision Trees
    • GenerativeUI
    • Automation flows
    • Data Mining & Analysis
    • Context and Object Oriented Memory
    • MCP

    In the future, I imagine, the internet will have pretty much been abstracted away from us. That isn’t a great thought for most of us, but it’s kind of a natural progression. Things change; for instance, it was naive of me to think that I could have another 20 years of making money doing the same thing (i.e. programming) until I retire.

    At school, I had a maths teacher who used to program in assembly. Talk about talent… in a few decades will we have any humans that know how to write in assembly? I always wondered why he didn’t still program, and it was probably because he didn’t move on.

    In the same way, for those of us who understand HTML/CSS/JS on a deep level, these skills are fast becoming abstracted away by vibe coding apps. The vibe coding apps that combine with Supabase are going to get better and better; but will the prediction machine ever get so good that it really understands programming and produces super clean code? Probably yes, as our prompt engineering gets better at helping it. Eventually, the personal assistants will replace the requirements that vibe coding serves now.

    The next big waves/industries to come along will be more widespread autonomous machines (drones, robots, cars), surveillance and wide-scale sentiment monitoring; then beyond that, some sort of biohacking industry with longevity at its core; and video games will have a resurgence in originality. So I do hope that something will offset the employment drop-off that is shortly going to happen.

    That’s enough thoughts for today.


    Work Updates:

    • I continue to work on a side project to bring in a small amount of money – enough to cover personal costs. It’s a fun, high-potential project, and it has made me realise I do like working on IT dev projects, and that I’m quickly improving at managing them.
    • The DXP project is slowly getting to its beta launch, which is great. Once that’s hit, I will share more details.
    • My own platform is coming along. I’ve shown it to a few people who I know wouldn’t sugarcoat things; it’s super early days, but the response was reasonably positive. Just got to keep going.
    • The ideas that would drive an AI Agency are coming together, but I’m still a little way off from doing that
    • Continually trying to keep up with the industry
    • More and more aware of the necessity of marketing for everything